This page has been machine-translated and may contain inaccuracies in phrasing or product terminology. If discrepancies exist, the original Japanese version takes precedence.
This document explains the core structure of the QDIC catalog and defines assets, the main content managed in the catalog. To operate and use a data catalog efficiently, it is essential to understand how assets and metadata are structured in the catalog, and how tags and custom categories relate to assets. With this understanding, searching for data in the catalog becomes smoother, and the data catalog can be used consistently across the organization.
Intended Audience for this Guide
Catalog Structure
The following diagram shows the structure of the catalog. Users access the catalog from the workspace to view the registered information. (Custom categories are not components of the catalog, but are displayed within the catalog for convenience.)
Services
The types of data sources described below are called services and represent the products or services supported by QDIC. For example, if the data source is Snowflake's data warehouse service, the service name will be Snowflake, and it will be displayed on the home screen along with the total number of assets when you log in to the workspace.
Operators can also create their own data sources from files. In that case, the service name is Others.
| Service Name | Product or Service | Data Type | Remarks |
| --- | --- | --- | --- |
| Alteryx | Alteryx | ETL | *1 |
| Athena | Amazon Athena | DB | |
| Azure Synapse | Azure Data Services | DB | |
| BigQuery | Google BigQuery | DB | |
| Databricks | Databricks | DB | |
| Denodo | Denodo Virtual DataPort | DB | |
| Fabric Warehouse | Microsoft Fabric Warehouse | DB | |
| MySQL | MySQL | DB | |
| Oracle | Oracle Database | DB | |
| PostgreSQL | PostgreSQL | DB | |
| Power BI | Microsoft Power BI | BI | |
| Redshift | Amazon Redshift | DB | |
| Snowflake | Snowflake | DB | |
| SQL Server | Microsoft SQL Server | DB | |
| Impala | Apache Impala | DB | *1 |
| Tableau | Tableau Cloud | BI | |
| Teradata | Teradata Vantage | DB | |
| Treasure Data | Treasure Data | DB | |
| Others | Created independently by the operator from a file; not tied to a specific product or service. The asset hierarchy is often designed to resemble that of database products. | | |
*1 To catalog metadata from Alteryx or Apache Impala, please contact our sales representative or customer success team.
Assets
Among the technical metadata obtained from services, the metadata that represents the logical structure of data (schemas, tables, columns, dashboards, sheets, and so on) is called assets. Assets are the objects that are managed in the catalog and published to users. The following information is associated with assets.
Metadata
The following information related to assets is collectively called metadata.
- Technical metadata obtained from the services of the user's organization, including lineage and statistical information.
- Basic asset metadata (owner, creation date, update date, etc.) updated by operators or QDIC.
- Business metadata updated by operators (logical name, summary, tags, etc.).
- Custom properties defined independently by users.
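The four kinds of metadata listed above can be pictured as fields on a single asset record. The following is a minimal sketch for illustration only: the class name `Asset`, every field name, and the sample values are assumptions and do not reflect QDIC's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    """Hypothetical in-memory model of a catalog asset (not QDIC's schema)."""
    service: str                  # service name, e.g. "Snowflake"
    asset_type: str               # "schema", "table", "column", "dashboard", ...
    physical_name: str
    # Technical metadata obtained from the service (lineage, statistics, ...)
    technical: dict = field(default_factory=dict)
    # Basic asset metadata updated by operators or the catalog
    owner: Optional[str] = None
    # Business metadata updated by operators
    logical_name: Optional[str] = None
    summary: Optional[str] = None
    tags: set = field(default_factory=set)
    # Custom properties defined by users
    custom_properties: dict = field(default_factory=dict)


# Illustrative values only.
orders = Asset(service="Snowflake", asset_type="table", physical_name="ORDERS")
orders.logical_name = "Customer orders"
orders.tags.add("sales")
orders.custom_properties["retention_days"] = 365
```

Keeping each kind of metadata in its own field mirrors the distinction in the list: who supplies the value (service, operator, or user) differs for each.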
Properties
Attributes of assets that users can define and assign values to as needed. These are called custom properties. Properties are also a type of metadata.
Lineage
Metadata that traces the flow and transformation of data between assets. It can be viewed by opening an asset in the catalog. Lineage visualizes upstream and downstream relationships, such as which tables a view is generated from, or which reports use a given table or view. In the diagram above, the thick blue line in the catalog represents lineage.
Lineage is technical metadata obtained from the user's organization services.
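Upstream and downstream relationships of this kind form a directed graph. The sketch below shows one way to walk such a graph in both directions. The class `LineageGraph`, its methods, and the asset names are hypothetical and are not part of QDIC.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical directed graph of lineage edges (source -> target)."""

    def __init__(self):
        self._down = defaultdict(set)  # asset -> assets derived from it
        self._up = defaultdict(set)    # asset -> assets it is derived from

    def add_edge(self, source, target):
        self._down[source].add(target)
        self._up[target].add(source)

    @staticmethod
    def _walk(start, edges):
        # Breadth-first traversal collecting every reachable asset.
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)  # exclude the starting asset if a cycle leads back
        return seen

    def upstream(self, asset):
        return self._walk(asset, self._up)

    def downstream(self, asset):
        return self._walk(asset, self._down)


# Illustrative asset names only.
g = LineageGraph()
g.add_edge("orders", "v_monthly_sales")       # table -> view
g.add_edge("customers", "v_monthly_sales")    # table -> view
g.add_edge("v_monthly_sales", "sales_report") # view -> report

sources_of_view = g.upstream("v_monthly_sales")
users_of_orders = g.downstream("orders")
```

Here `sources_of_view` answers "which tables is this view generated from", and `users_of_orders` answers "which views and reports use this table".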
Statistical Information
Metadata that describes the characteristics and trends of data, calculated by statistical computation. It can be viewed by opening an asset in the catalog. Statistical information serves as one basis for data analysis and decision-making. It is also used to check data quality, for example by detecting outliers against historical values.
Statistical information is technical metadata obtained from the services of the user's organization, where it is calculated.
Tags
Labels assigned to assets to classify and organize them. Tags are grouped and form a parent-child hierarchy.
Tags are also business metadata for assets.
Custom Categories
Groups of assets organized based on the tags assigned to them. Users can access all assets included in a custom category collectively from the workspace. (Strictly speaking, custom categories are not catalog components.)
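The relationship between tags, the tag hierarchy, and custom categories can be sketched as follows. This is an illustration under stated assumptions: the document does not specify how a custom category matches tags, so the rule used here (an asset belongs to a category if any of its tags, or any ancestor of those tags, is one of the category's tags) is hypothetical, as are all function and tag names.

```python
def ancestors(tag, parent_of):
    """Yield the tag itself, then each ancestor in the parent-child hierarchy."""
    seen = set()
    while tag is not None and tag not in seen:
        seen.add(tag)
        yield tag
        tag = parent_of.get(tag)


def assets_in_category(category_tags, asset_tags, parent_of):
    """Return the assets whose tags (or their ancestors) match the category.

    Assumed matching rule, not confirmed QDIC behavior.
    """
    members = []
    for asset, tags in asset_tags.items():
        expanded = {a for t in tags for a in ancestors(t, parent_of)}
        if expanded & category_tags:
            members.append(asset)
    return sorted(members)


# Illustrative hierarchy: email -> pii -> sensitivity
parent_of = {"pii": "sensitivity", "email": "pii"}
asset_tags = {
    "customers": {"pii"},
    "customers.email": {"email"},
    "orders.total": {"finance"},
}

pii_category = assets_in_category({"pii"}, asset_tags, parent_of)
```

In this example, `customers.email` is included through its child tag `email`, while `orders.total` is excluded.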
Tenant and Catalog Structure Managed by Operators
Below is the structure of tenants and catalogs managed by operators or administrators. This is provided for reference, but in most cases, users do not need to be aware of it.
Data Sources
The sources that supply assets are managed in units called data sources. In most cases, a data source is created per host for each product or service connection. Users cannot access data source information.
Depending on how data sources are created and how metadata is updated in the catalog, data sources are classified as follows. The type of data source is determined by administrators and operators.
agent data source
A data source whose assets and metadata are updated directly from the connected host through QDIC's program (connector).
csv data source
A data source created by the operator using a file.
Asset Groups
Groups of assets created by operators. Asset groups can include assets belonging to multiple data sources. In most cases, assets that users can reference are limited to those belonging to asset groups created by operators. Asset groups are not catalog settings, so users cannot access asset group information.
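The operator-side structure described in this section can be sketched as data sources of two types, with asset groups that span them. All names below are hypothetical. In particular, the idea that a user is granted a list of asset group names is an assumption made for illustration; the document does not describe how access is assigned.

```python
from dataclasses import dataclass
from enum import Enum


class DataSourceType(Enum):
    AGENT = "agent"  # updated from the host through a connector
    CSV = "csv"      # created by an operator from a file


@dataclass(frozen=True)
class SourcedAsset:
    """Hypothetical asset record that remembers its data source."""
    name: str
    data_source: str
    source_type: DataSourceType


def visible_assets(granted_groups, asset_groups):
    """Union of the assets in every asset group named in granted_groups."""
    visible = set()
    for group in granted_groups:
        visible |= asset_groups.get(group, set())
    return visible


# Illustrative data only.
orders = SourcedAsset("ORDERS", "snowflake-prod", DataSourceType.AGENT)
budget = SourcedAsset("budget_2024", "finance-files", DataSourceType.CSV)
employees = SourcedAsset("EMPLOYEES", "postgres-hr", DataSourceType.AGENT)

asset_groups = {
    "sales-analytics": {orders, budget},  # spans two data sources
    "hr-restricted": {employees},
}

visible = visible_assets(["sales-analytics"], asset_groups)
```

The `sales-analytics` group mixes an agent data source and a csv data source, which reflects the statement that an asset group can include assets from multiple data sources.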
Relationship Between Data Structure and Assets
The following diagram shows how metadata that represents the logical structure of data (schemas, tables, columns, dashboards, sheets, and so on) in the products and services of the user's organization is stored as assets in the catalog.
The concepts and names for metadata representing the logical structure of data vary by service, but within the catalog, they are categorized under the same concept regardless of service type. For example, a logical table created by joining tables in a database is generally called a view in databases, but in the catalog, views are also treated as tables.
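This normalization can be pictured as a lookup from a service's object type to the catalog's asset type. The sketch below encodes only the one mapping stated above (a view is treated as a table) and passes every other type through unchanged. The function and dictionary names are hypothetical, and the pass-through behavior is an assumption.

```python
# Only the view -> table mapping comes from the text; anything else
# is passed through unchanged as an illustrative default.
SERVICE_TO_CATALOG_TYPE = {
    "view": "table",
}


def to_catalog_type(service_object_type: str) -> str:
    """Map a service-specific object type to a catalog asset type."""
    key = service_object_type.strip().lower()
    return SERVICE_TO_CATALOG_TYPE.get(key, key)


view_type = to_catalog_type("VIEW")
column_type = to_catalog_type("column")
```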
For more details on the relationship between data structure and assets, see the following: Relationship Between Data Source Data Structure and Assets