Update the default search query to exclude specific entities from the metadata load.
Example 1: Excluding HDFS Entities in a Specific Directory
Your Cloudera distribution contains a temporary user named "test." When you view the HDFS directory
in Cloudera Navigator, you see that all of the files owned by the test user are written to a single
directory. Therefore, you do not want Metadata Manager to extract HDFS entities in that directory
or its subdirectories.
To prevent Metadata Manager from extracting the entities, append an OR clause for the
directory's file path to the default search query as follows:
NOT ((fileSystemPath:*\/.cloudera_manager_hive_metastore_canary*) OR (fileSystemPath:\/hbase\/oldWALs*) OR (fileSystemPath:\/hbase\/WALs*) OR (fileSystemPath:\/tmp\/logs*) OR (fileSystemPath:\/user\/history\/done*) OR (fileSystemPath:\/tmp\/hive-cloudera*) OR (fileSystemPath:\/tmp\/hive-hive*) OR (fileSystemPath:*\/.Trash*)
OR (fileSystemPath:*\/user\/test*))
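If you assemble the query in a script before pasting it into the resource configuration, the append step can be sketched in Python as follows. This is a minimal illustration, not part of Metadata Manager or Cloudera Navigator: the names DEFAULT_QUERY, escape_path, and append_excluded_path are hypothetical, and the sketch assumes that the query ends with the closing parenthesis of the outer NOT ( ... ) group.

```python
# Default exclusion query, copied from the default search query text.
DEFAULT_QUERY = (
    r"NOT ((fileSystemPath:*\/.cloudera_manager_hive_metastore_canary*)"
    r" OR (fileSystemPath:\/hbase\/oldWALs*)"
    r" OR (fileSystemPath:\/hbase\/WALs*)"
    r" OR (fileSystemPath:\/tmp\/logs*)"
    r" OR (fileSystemPath:\/user\/history\/done*)"
    r" OR (fileSystemPath:\/tmp\/hive-cloudera*)"
    r" OR (fileSystemPath:\/tmp\/hive-hive*)"
    r" OR (fileSystemPath:*\/.Trash*))"
)


def escape_path(path: str) -> str:
    """Escape forward slashes the same way the default query does."""
    return path.replace("/", r"\/")


def append_excluded_path(query: str, path: str) -> str:
    """Insert one more fileSystemPath clause inside the outer NOT ( ... ) group.

    Assumption: the last character of the query closes the outer group.
    """
    if not query.endswith(")"):
        raise ValueError("query must end with the closing parenthesis of NOT ( ... )")
    clause = f" OR (fileSystemPath:*{escape_path(path)}*)"
    # Drop the final ")", add the new clause, then close the group again.
    return query[:-1] + clause + ")"


updated = append_excluded_path(DEFAULT_QUERY, "/user/test")
print(updated)
```

Because the new clause is placed before the final parenthesis, it stays inside the NOT group and the parentheses remain balanced.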
Example 2: Excluding Job Executions
To prevent Metadata Manager from loading YARN, Oozie, and MapReduce job executions and all Sqoop job templates and executions, update the default search query as follows:
NOT ((fileSystemPath:*\/.cloudera_manager_hive_metastore_canary*) OR (fileSystemPath:\/hbase\/oldWALs*) OR (fileSystemPath:\/hbase\/WALs*) OR (fileSystemPath:\/tmp\/logs*) OR (fileSystemPath:\/user\/history\/done*) OR (fileSystemPath:\/tmp\/hive-cloudera*) OR (fileSystemPath:\/tmp\/hive-hive*) OR (fileSystemPath:*\/.Trash*))
AND NOT (((sourceType:YARN OR sourceType:OOZIE OR sourceType:MAPREDUCE) AND type:OPERATION_EXECUTION) OR sourceType:SQOOP)
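The Boolean logic of the AND NOT clause can be checked with a small Python sketch that mirrors the query locally. It does not call Cloudera Navigator. The function name is_excluded is hypothetical, and the sample type value OPERATION is only a stand-in for an entity that is not a job execution.

```python
# Sources for which only job executions are excluded.
EXECUTION_ONLY_SOURCES = {"YARN", "OOZIE", "MAPREDUCE"}
# Sources that are excluded entirely, whatever the entity type.
FULLY_EXCLUDED_SOURCES = {"SQOOP"}


def is_excluded(source_type: str, entity_type: str) -> bool:
    """Mirror of: ((YARN OR OOZIE OR MAPREDUCE) AND OPERATION_EXECUTION) OR SQOOP."""
    is_job_execution = (
        source_type in EXECUTION_ONLY_SOURCES
        and entity_type == "OPERATION_EXECUTION"
    )
    return is_job_execution or source_type in FULLY_EXCLUDED_SOURCES


# Illustrative (sourceType, type) pairs, not real Navigator output.
samples = [
    ("YARN", "OPERATION_EXECUTION"),
    ("YARN", "OPERATION"),
    ("SQOOP", "OPERATION"),
    ("SQOOP", "OPERATION_EXECUTION"),
]
for source_type, entity_type in samples:
    status = "excluded" if is_excluded(source_type, entity_type) else "loaded"
    print(source_type, entity_type, status)
```

The sourceType:SQOOP term sits outside the type:OPERATION_EXECUTION condition, which is why the query drops Sqoop job templates as well as Sqoop executions, while YARN, Oozie, and MapReduce entities are dropped only when they are executions.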