Metadata Manager Administrator Guide

Differences between Cloudera Navigator and Metadata Manager Lineage Diagrams

When you compare data lineage diagrams between Metadata Manager and Cloudera Navigator, the diagrams might display different sequences of objects. The sequences can differ in the following circumstances:
When an entity that you view has different relationships with other entities.
Cloudera Navigator displays data flow, logical-physical, and control flow relationships in data lineage diagrams. Metadata Manager displays only data flow relationships; it does not display logical-physical or control flow relationships.
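As a rough illustration of the difference, the following sketch records the relationship types each tool displays and computes the ones that appear only in Cloudera Navigator diagrams. The dictionary and function names are hypothetical and do not correspond to any Metadata Manager or Cloudera Navigator API.

```python
# Relationship types that each tool displays in data lineage diagrams,
# as described in the text. All names here are illustrative only.
DISPLAYED_RELATIONSHIPS = {
    "Cloudera Navigator": {"data flow", "logical-physical", "control flow"},
    "Metadata Manager": {"data flow"},
}


def only_in_cloudera_navigator():
    """Return the relationship types that Metadata Manager does not display."""
    return (
        DISPLAYED_RELATIONSHIPS["Cloudera Navigator"]
        - DISPLAYED_RELATIONSHIPS["Metadata Manager"]
    )


if __name__ == "__main__":
    print(sorted(only_in_cloudera_navigator()))
```

Any entity whose only links are of the types returned here can appear in a Cloudera Navigator diagram without appearing in the corresponding Metadata Manager diagram.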
When you run data lineage on a template.
Generally, Cloudera Navigator shows data flow relationships at the template, or operation, level. Operation executions might override or add data flow.
Metadata Manager generally shows data flow for physical objects, such as HDFS files and directories, and for operation executions. Metadata Manager does not show data flow for most template types. Therefore, when you run data lineage on Oozie, Pig, Sqoop, or YARN job templates, you do not see data flow. To see the data flow, run data lineage on one of the template executions.
Note that for Hive and Impala queries, Metadata Manager shows data flow for the query templates instead of the query executions, because data lineage is identical across all executions of a query.
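The rule above can be summarized as a small lookup. This is a minimal sketch, assuming the template types are identified by simple strings; the function name and the string labels are hypothetical and not part of Metadata Manager.

```python
# Hypothetical helper that encodes which object to run data lineage on
# in Metadata Manager to see data flow. The type names are illustrative labels.
TEMPLATE_LEVEL_TYPES = {"hive", "impala"}                   # data flow shown on query templates
EXECUTION_LEVEL_TYPES = {"oozie", "pig", "sqoop", "yarn"}   # data flow shown on executions


def lineage_target(template_type):
    """Return 'template' or 'execution' for the given template type."""
    kind = template_type.strip().lower()
    if kind in TEMPLATE_LEVEL_TYPES:
        return "template"
    if kind in EXECUTION_LEVEL_TYPES:
        return "execution"
    # The text covers only the types listed above, so anything else is rejected.
    raise ValueError(f"No rule recorded for template type: {template_type!r}")


if __name__ == "__main__":
    for name in ("Hive", "Impala", "Oozie", "Pig", "Sqoop", "YARN"):
        print(name, "->", lineage_target(name))
```

For example, `lineage_target("Sqoop")` returns `"execution"`, which matches the guidance to run data lineage on one of the template executions.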
When you run data lineage on a Hive or Impala query template.
When you run data lineage on a Hive or Impala query template that retrieves information from or creates a Hive table, Cloudera Navigator does not display the HDFS entity that is linked to the Hive table. Metadata Manager does display the HDFS entity in the data lineage diagram.
For example, a Hive query template contains a SELECT statement to retrieve information from a Hive table. In the data lineage diagram, Metadata Manager displays the HDFS entity that is linked to the Hive table upstream of the Hive table. If a Hive query template contains a CREATE statement to create a Hive table, Metadata Manager displays the HDFS entity that is linked to the Hive table downstream of the Hive table.
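The placement of the HDFS entity in the example above can be sketched as a directed edge between the HDFS entity and the Hive table. This is an illustration only: the function name, the entity labels, and the handling of other statement types are assumptions.

```python
# Hypothetical sketch of where Metadata Manager places the HDFS entity
# relative to the Hive table, based on the statement in the query template.
def hdfs_edge(statement):
    """Return an (upstream, downstream) pair for the HDFS entity and Hive table."""
    keyword = statement.strip().split()[0].upper()
    if keyword == "SELECT":
        # Query reads from the table: HDFS entity is upstream of the Hive table.
        return ("HDFS entity", "Hive table")
    if keyword == "CREATE":
        # Query creates the table: HDFS entity is downstream of the Hive table.
        return ("Hive table", "HDFS entity")
    # Other statement types are not described in the text.
    raise ValueError(f"Statement type not covered: {keyword}")


if __name__ == "__main__":
    print(hdfs_edge("SELECT * FROM sales"))
    print(hdfs_edge("CREATE TABLE sales_copy AS SELECT * FROM sales"))
```

In Cloudera Navigator, neither edge appears in the diagram, because Cloudera Navigator does not display the HDFS entity that is linked to the Hive table.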
When you run data lineage on a YARN job execution.
Cloudera Navigator displays data flow relationships between YARN job executions and Hive query templates. Because Hive queries and tables are relational views of HDFS files, Metadata Manager shows data flow between the YARN job execution and the HDFS files.
