Data Catalog
A fully managed and highly scalable data discovery and metadata management service.
New customers get $300 in free credits to spend on Google Cloud during the Free Trial. All customers get up to 1 MiB of business or ingested metadata storage and 1 million API calls, free of charge.
- Pinpoint your data with a simple but powerful faceted-search interface
- Sync technical metadata automatically and create schematized tags for business metadata
- Tag sensitive data automatically through Cloud Data Loss Prevention (DLP) integration
- Get access immediately, then scale without infrastructure to set up or manage
Benefits
Simplifies data discovery at any scale
Empower any user on the team to find or tag data with a powerful UI, built with the same search technology as Gmail, or via API access. Data Catalog is fully managed, so you can start and scale effortlessly.
Offers a unified view of all datasets
Understand your data assets in Google Cloud and beyond. Integrations with BigQuery, Pub/Sub, Cloud Storage, and many connectors provide a unified view and tagging mechanism for technical and business metadata.
Key features
Serverless
Fully managed and scalable metadata management service; requires no infrastructure to set up or manage, allowing you to focus on your business.
Metadata as a service
Metadata management service for cataloging data assets via custom APIs and the UI, thereby providing a unified view of data wherever it is.
Central catalog
A flexible and powerful cataloging system for capturing both technical metadata (automatically) as well as business metadata (tags) in a structured format.
Documentation
Overview of Data Catalog
Find out why you need a Data Catalog and how it powers the efficient use of your data.
Quickstart for tagging datasets
Make a BigQuery dataset, create a tag template with a schema, look up the Data Catalog entry for your table, and attach the tag to your table.
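The lookup and tagging steps in this quickstart depend on a few resource-name strings. The sketch below shows one way they might be assembled. The helper names are hypothetical, and the field names in the tag payload (`boolValue`, `doubleValue`, `stringValue`) are assumed to follow the REST representation of a tag; check them against the API reference before relying on them.

```python
# Hypothetical helpers: these names are illustrative and do not belong to any client library.

def bigquery_table_linked_resource(project_id: str, dataset_id: str, table_id: str) -> str:
    """Full resource name of a BigQuery table, used when looking up its catalog entry."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def tag_template_name(project_id: str, location: str, template_id: str) -> str:
    """Resource name of a tag template in a given project and location."""
    return f"projects/{project_id}/locations/{location}/tagTemplates/{template_id}"


def build_tag(template: str, values: dict) -> dict:
    """Wrap plain Python values in typed tag fields (assumed REST-style field names)."""
    fields = {}
    for field_id, value in values.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            fields[field_id] = {"boolValue": value}
        elif isinstance(value, (int, float)):
            fields[field_id] = {"doubleValue": float(value)}
        elif isinstance(value, str):
            fields[field_id] = {"stringValue": value}
        else:
            raise TypeError(f"unsupported value for {field_id}: {type(value).__name__}")
    return {"template": template, "fields": fields}


# Example with made-up project, dataset, table, and field names.
template = tag_template_name("my-project", "us-central1", "demo_template")
tag = build_tag(template, {"has_pii": True, "num_rows": 1200, "source": "crm export"})
linked = bigquery_table_linked_resource("my-project", "demo_dataset", "trips")
```

The resulting dictionary would still need to be sent through the API or a client library together with the entry found for `linked`; this sketch only prepares the inputs.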
How to search with Data Catalog
Use Data Catalog to perform a search of data assets, such as datasets, tables, views, and Pub/Sub topics in your Google Cloud projects.
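A search request combines a scope (which projects to look in) with a query string. The sketch below composes both as plain data. It assumes the qualifier syntax `key:value` for substring matches and `key=value` for exact matches on `type` and `system`, and REST-style field names (`scope.includeProjectIds`, `query`). The function names are hypothetical, and values containing spaces are not handled.

```python
# Qualifiers assumed to use exact matching ("=") rather than substring matching (":").
EXACT_MATCH_QUALIFIERS = {"type", "system"}


def build_search_query(*keywords: str, **qualifiers: str) -> str:
    """Join free-text keywords and qualifier filters into one query string."""
    parts = list(keywords)
    for key, value in qualifiers.items():
        operator = "=" if key in EXACT_MATCH_QUALIFIERS else ":"
        parts.append(f"{key}{operator}{value}")
    return " ".join(parts)


def build_search_request(query: str, project_ids: list) -> dict:
    """Assemble a request body limited to the given projects."""
    return {
        "scope": {"includeProjectIds": list(project_ids)},
        "query": query,
    }


# Example: tables whose columns mention "customer_id", in one made-up project.
query = build_search_query("orders", type="table", column="customer_id")
request = build_search_request(query, ["my-project"])
```

Keeping the query builder separate from the request builder makes it easy to reuse the same query across different project scopes.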
Restricting access with BigQuery column-level security
This page explains how to use BigQuery column-level security to restrict access to BigQuery data at the column level.
Access on-premises metadata connectors on GitHub
Shared (commons) code used by the Data Catalog connectors, plus links to sample code for each connector.
Use cases
You can use the Data Catalog API to create your own connectors for ingesting metadata from a data source of your choice. We also provide ready-to-use, open-source connectors for ingesting metadata from a number of common data sources, such as MySQL, PostgreSQL, Hive, Teradata, Oracle, SQL Server, Redshift, and more. Once in Data Catalog, all assets can be searched for and tagged.
The Data Catalog API can also be used to ingest metadata from any business intelligence asset. For Looker and Tableau, we have open-sourced ready-to-use connectors so that assets from those tools are discoverable and can be tagged directly in Data Catalog.
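At its core, a custom connector reads metadata from a source system and reshapes it into catalog entries. The sketch below shows only that reshaping step, using plain dictionaries. The field names (`userSpecifiedSystem`, `userSpecifiedType`, `linkedResource`, `schema.columns`) are assumed to mirror the REST representation of a custom entry, the `//host/database/table` linked-resource layout is a made-up convention, and the entry ID rule (letters, digits, and underscores, not starting with a digit, at most 64 characters) is an assumption to verify against the API reference.

```python
import re


def to_entry_id(raw_name: str) -> str:
    """Turn an arbitrary source name into an ID matching the assumed entry ID rule."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", raw_name)
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned  # IDs are assumed not to start with a digit
    return cleaned[:64]


def to_custom_entry(system: str, host: str, database: str, table: dict) -> dict:
    """Map one source table description to an entry-like dictionary."""
    return {
        "entryId": to_entry_id(f"{database}_{table['name']}"),
        "entry": {
            "displayName": table["name"],
            "userSpecifiedSystem": system,
            "userSpecifiedType": "table",
            # Hypothetical layout for pointing back at the source table.
            "linkedResource": f"//{host}/{database}/{table['name']}",
            "schema": {
                "columns": [
                    {"column": name, "type": sql_type}
                    for name, sql_type in table["columns"]
                ]
            },
        },
    }


# Example input, shaped like something a connector might read from a database's
# own metadata tables (all values here are made up).
source_table = {"name": "orders", "columns": [("id", "INT"), ("total", "DECIMAL")]}
result = to_custom_entry("mysql", "db.example.internal", "sales", source_table)
```

A real connector would add the scraping step in front of this mapping and an API call after it; the open-source connectors mentioned above already cover both for the listed systems.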
All features
Serverless | Fully managed and scalable metadata management service; requires no infrastructure to set up or manage, allowing you to focus on your business. |
Metadata as a service | Metadata management service for cataloging data assets via custom APIs and the UI, thereby providing a unified view of data wherever it is. |
Central catalog | A flexible and powerful cataloging system for capturing both technical metadata (automatically) as well as business metadata (tags) in a structured format. |
Search and discovery | An easy-to-use UI with powerful structured search capabilities for quickly finding data assets; powered by the same Google search technology that supports Gmail and Drive. |
Schematized metadata | Supports schematized tags (e.g., Enum, Bool, DateTime), not just simple text tags, giving organizations rich and organized business metadata. |
Cloud DLP integration | Discovers and classifies sensitive data, providing intelligence and helping to simplify the process of governing your data. |
On-prem connectors | Ingest technical metadata from non-Google Cloud data assets to Data Catalog for a unified view of all your data assets. |
Cloud IAM integration | Provides access-level controls and honors source ACLs for reading, writing, and searching data assets, giving you enterprise-grade access control. |
Governance | Offers a strong security and compliance foundation with Cloud DLP and Cloud IAM integrations. |
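To illustrate what "schematized" means for the tags described in the feature list above, the sketch below models a tag template as a set of typed fields and checks a tag against it. This is a plain-Python illustration of the idea, not the Data Catalog API; the template, field IDs, and function names are all hypothetical.

```python
from datetime import datetime

# Hypothetical template: field IDs and allowed enum values are made up for illustration.
TEMPLATE = {
    "data_owner": {"type": "STRING", "required": True},
    "has_pii": {"type": "BOOL"},
    "last_reviewed": {"type": "DATETIME"},
    "quality_tier": {"type": "ENUM", "allowed": {"GOLD", "SILVER", "BRONZE"}},
}


def _matches(spec: dict, value) -> bool:
    """Return True if the value has the type declared for the field."""
    kind = spec["type"]
    if kind == "STRING":
        return isinstance(value, str)
    if kind == "BOOL":
        return isinstance(value, bool)
    if kind == "DATETIME":
        return isinstance(value, datetime)
    if kind == "ENUM":
        return isinstance(value, str) and value in spec["allowed"]
    return False


def validate_tag(template: dict, tag: dict) -> list:
    """Collect every problem found when checking a tag against a template."""
    errors = []
    for field_id, spec in template.items():
        if spec.get("required") and field_id not in tag:
            errors.append(f"missing required field: {field_id}")
    for field_id, value in tag.items():
        spec = template.get(field_id)
        if spec is None:
            errors.append(f"unknown field: {field_id}")
        elif not _matches(spec, value):
            errors.append(f"invalid value for {field_id}: {value!r}")
    return errors


good_tag = {
    "data_owner": "analytics-team",
    "has_pii": True,
    "last_reviewed": datetime(2024, 1, 15),
    "quality_tier": "GOLD",
}
bad_tag = {"quality_tier": "PLATINUM", "has_pii": "yes"}
```

With a free-text tag, a value such as `"PLATINUM"` or `"yes"` would be accepted silently; with a schema, `validate_tag(TEMPLATE, bad_tag)` reports both problems along with the missing owner.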
Pricing
Pricing for Data Catalog is split between metadata storage and API calls, both on a consumption basis. Metadata storage includes any new metadata stored in Data Catalog, including:
• Business metadata, such as Data Catalog tag templates and tags
• Cloud Storage filesets schemas attached to Pub/Sub topics
• Custom types metadata stored in Data Catalog, etc.
Metadata storage does not include the technical metadata stored by other Google Cloud services, for example, dataset, table, and column names stored in BigQuery. Detailed pricing and examples for both metadata storage and API calls are available in the Data Catalog documentation.