Good data is hard to get, and the lack of it has derailed more than one data initiative. But with a trio of product announcements at its inaugural Data Cloud Summit this week, including a data fabric called Dataplex, a data exchange called Analytics Hub, and a change data capture (CDC) solution called Datastream, Google Cloud is at least attacking the problem. A Gartner analyst says the new offerings show a continued move toward making Google Cloud easier for customers to do business with.
Obtaining good, clean, and consistent data remains a major challenge for companies and their data analytics and AI initiatives. With data distributed across multiple databases, data warehouses, and data lakes, getting a single view of the data can be extremely difficult. In fact, poor data quality costs companies an average of $12.8 million a year, according to Gartner research cited by Google Cloud.
To that end, Google Cloud introduced three new offerings to address the issue, starting with Datastream, its new serverless CDC and data replication service.
Datastream allows customers to replicate data streams in real time from Oracle and MySQL databases to Google Cloud services, including BigQuery, Cloud SQL, Google Cloud Storage, and Cloud Spanner. According to a chart shared with Datanami, the product, which is currently in preview, will be expanded to support additional local databases, including Db2, Postgres, MongoDB, and others.
Gartner analyst Sanjeev Mohan says Datastream will put Google Cloud in competition with other data integration and ETL providers, including Matillion, Fivetran, HVR, Striim, and Oracle’s GoldenGate. That is a sign of how important these data movement products have become, he says.
“Will it be strong? The answer is that it depends on what the ecosystem is for customers,” says Mohan. “For some of the new customers, like Vodafone, who are switching to GCP, I think this is a very good option. But if a customer says, ‘I have AWS and … Google Cloud is not the only cloud,’ if it’s multi-cloud, they may look for a product that is neutral across cloud providers, because they need a product where they build pipelines.”
Google Cloud’s upcoming data sharing offering, called Analytics Hub, is designed to let users share data and analytics assets, including dashboards and machine learning models, securely with others inside and outside their organization, according to the company. Google Cloud says the offering, which is not yet available in preview but soon will be, builds on BigQuery’s existing and popular sharing features.
Mohan says secure data sharing is catching on with more and more companies. “The idea of data sharing is not to make multiple copies of data, but to have a single copy of data and share it securely,” he says.
Meanwhile, Dataplex is billed by Google Cloud as a “smart data fabric” that can provide “an integrated analytics experience.” The offering, which is currently in preview, will allow users to “curate, protect, integrate and quickly analyze their data on a large scale,” the company says. Dataplex includes automated data quality features for data scientists, as well as artificial intelligence and machine learning features that allow companies to “spend less time struggling” with systems and more time “using data to get business results,” the company says.
Offering a single view of data and analytics resources, regardless of where they sit in the cloud, is a good idea that other cloud providers are also pursuing, Mohan says. Some independent software vendors, such as Cloudera, are pursuing it as well, he says. Dataplex works with a customer’s assets in Google Cloud and, ultimately, in other clouds too, for example through Google Cloud’s BigQuery Omni, which today supports Azure.
“They’re embracing this hybrid, multi-cloud space,” Mohan says. “But the problem with multi-cloud is how to unify both your analytics and your data governance. You need to be able to see where the data is coming from and have a common lineage, so Dataplex is this integrated data management platform that can be placed on top of a raw data lake, a data warehouse, or even a database.”
In general, Mohan likes where Google Cloud is headed. “I think they are starting to execute on a more enterprise-ready strategy by unifying their data story,” he tells Datanami. “So they are adding more features. They are simplifying the serverless architecture. They can further reduce complexity. Their billing models are also being simplified in this process [with] pay as you go. So I think Google Cloud is starting to round out its data strategy to be more cohesive and more enterprise-friendly for its customers.”