Metadata Manager Administrator Guide

How Metadata Manager Displays Entity Relationships

The metadata component of Cloudera Navigator displays several types of entity relationships in data lineage diagrams. Metadata Manager does not display all of these relationships in data lineage diagrams, and how it displays a relationship depends on the relationship type.
Metadata Manager displays the following Cloudera entity relationship types in different ways:
Data flow relationships
A data flow relationship defines how data flows between metadata objects. For example, a Hive query uses an INSERT OVERWRITE TABLE statement to load data into a Hive table. Because data flows from the Hive query to the Hive table, a data flow relationship exists between the query and the table.
Cloudera Navigator displays data flow relationships in data lineage diagrams as solid arrows. Metadata Manager displays data flow relationships as lineage links in data lineage diagrams and as related catalog objects in the metadata catalog.
Logical-physical relationships
A logical-physical relationship indicates that a logical object is based on an actual, physical entity. For example, a Hive table is a logical view of a physical HDFS entity.
Cloudera Navigator displays logical-physical relationships in data lineage diagrams as solid lines without arrow heads. Metadata Manager displays logical-physical relationships as related catalog objects in the metadata catalog.
Instance relationships
An instance relationship defines a single occurrence of an operation. For example, an Oozie job execution is an instance of an Oozie job template.
Cloudera Navigator displays instance relationships for query and job templates on a separate tab in the data lineage diagram. Metadata Manager displays instance relationships as related catalog objects in the metadata catalog.
Control flow relationships
A control flow relationship places constraints or conditions on the flow of data. For example, a Hive query can contain constraints in the WHERE clause, or the JOIN clause in a Hive query might include a Hive table from which no data is extracted.
Cloudera Navigator displays control flow relationships in data lineage diagrams as dashed lines. Metadata Manager ignores control flow relationships.
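The four relationship types and how Metadata Manager treats them can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `DISPLAY` and `shown_in_lineage` are hypothetical and are not part of any Metadata Manager or Cloudera Navigator API.

```python
# Hypothetical summary of the behavior described above; not a product API.
# Maps each Cloudera relationship type to where Metadata Manager shows it.
DISPLAY = {
    "data flow": {"lineage diagram", "metadata catalog"},
    "logical-physical": {"metadata catalog"},
    "instance": {"metadata catalog"},
    "control flow": set(),  # ignored by Metadata Manager
}


def shown_in_lineage(relationship_type: str) -> bool:
    """Return True if Metadata Manager draws this type as a lineage link."""
    return "lineage diagram" in DISPLAY[relationship_type]


if __name__ == "__main__":
    for name, places in DISPLAY.items():
        print(f"{name}: {', '.join(sorted(places)) or 'not displayed'}")
```

Only data flow relationships appear as lineage links. Logical-physical and instance relationships appear only as related catalog objects.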
For example, your Hadoop cluster contains a Hive products table with price and cost columns. It also contains a Hive query template with the following query:
SELECT AVG(price - cost) AS profit
FROM products
  JOIN order_details ON (order_details.prod_id = products.prod_id)
  JOIN orders ON (order_details.order_id = orders.order_id)
WHERE YEAR(order_date) = 2014
  AND MONTH(order_date) = 12
  AND price >= 500
In this query, the SELECT clause reads the price and cost columns, so data flows from the products table to the Hive query. The JOIN clauses include two Hive tables, order_details and orders, from which no data is extracted.
Cloudera Navigator shows data flow from the products table to the Hive query. It also shows control flow relationships between the order_details table and the Hive query and between the orders table and the Hive query.
Metadata Manager also shows data flow from the products table to the Hive query. However, Metadata Manager does not show any relationship between the order_details or orders tables and the Hive query.
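The difference between the two tools for this example can be modeled as a filter over a list of relationships. This is a minimal sketch: the tuple layout, the `hive_query` label, and the function name are assumptions made for illustration, and do not reflect how either product stores lineage.

```python
# Illustrative model of the example; the tuple layout and names are assumptions.
# Each edge is (source, target, relationship_type) as Cloudera Navigator shows it.
NAVIGATOR_EDGES = [
    ("products", "hive_query", "data flow"),
    ("order_details", "hive_query", "control flow"),
    ("orders", "hive_query", "control flow"),
]


def metadata_manager_edges(edges):
    """Drop control flow relationships, which Metadata Manager ignores."""
    return [edge for edge in edges if edge[2] != "control flow"]


if __name__ == "__main__":
    for source, target, kind in metadata_manager_edges(NAVIGATOR_EDGES):
        print(f"{source} -> {target} ({kind})")
```

Of the three relationships, only the data flow from products to the query remains after filtering.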
To view the relationship type between entities in a Cloudera Navigator data lineage diagram, download and view the lineage JSON file.
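After you download the lineage JSON file, a short script can tally the relationship types that it contains. This guide does not describe the layout of that file, so the sketch below makes no assumption about nesting. It walks the whole document and counts the string values stored under one key. The key name `type` and the file name `lineage.json` are assumptions, so inspect the file and adjust them as needed.

```python
import json
from collections import Counter


def count_key_values(node, key="type"):
    """Recursively count string values stored under `key` in parsed JSON.

    The default key name "type" is an assumption about the file layout.
    """
    counts = Counter()
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key and isinstance(value, str):
                counts[value] += 1
            else:
                counts.update(count_key_values(value, key))
    elif isinstance(node, list):
        for item in node:
            counts.update(count_key_values(item, key))
    return counts


def summarize_file(path, key="type"):
    """Load a JSON file and count the values found under `key`."""
    with open(path, encoding="utf-8") as handle:
        return count_key_values(json.load(handle), key)


if __name__ == "__main__":
    # "lineage.json" is a placeholder for the downloaded file.
    for value, total in summarize_file("lineage.json").most_common():
        print(f"{value}: {total}")
```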
