Enterprise Data Preparation Administrator Guide

Create Catalog Resources

Use Informatica Catalog Administrator to create Hive and HDFS resources for the data lake in Enterprise Data Catalog.
A resource is a repository object that represents an external data source or metadata repository. Scanners attached to a resource extract metadata from the resource and store the metadata in Enterprise Data Catalog.
For more information about creating resources and scanners, see "Creating a Resource" in the Informatica Service Pack 1 Catalog Administrator Guide.
You can use the Enterprise Data Preparation application to add Enterprise Data Catalog resources to the data lake. You must add at least one Hive resource to the data lake. A Hive resource is required to perform export, import, data preparation, and publish operations in Enterprise Data Preparation.
  1. Create a Hive scanner that Enterprise Data Catalog uses to extract metadata from the Hive tables in the data lake. For more information about Hive scanner properties, see "Hive Resource Prerequisites and Connection Properties" in the Informatica 10.2.2 Service Pack 1 Administrator Guide. You must create the Hive resource with the following settings for Enterprise Data Preparation:
    • In the URL property on the General Connection Properties panel, specify the Fully Qualified Domain Name (FQDN) of the Hive server in the JDBC connection URL.
    • If you are using operating system profiles, the Hive user name that you specify as the value for the User property must be a Hive superuser. For more information about operating system profiles, see "Using Operating System Profiles."
    • Import the relevant connectors to extract metadata from Hive sources.
  2. Create an HDFS resource that Enterprise Data Catalog uses to extract metadata from the HDFS storage in the data lake. For more information about HDFS resource properties, see "HDFS Resource Connection Properties" in the Informatica Service Pack 1 Catalog Administrator Guide.
  3. Run a scan on the resources to load metadata into the catalog.
  4. Create schedules for the resources so that Enterprise Data Catalog regularly scans the resources. As a best practice, schedule the resource scans to run during non-business hours.
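
The FQDN requirement for the Hive resource URL property in step 1 can be checked before you enter the value. The following Python sketch is not part of any Informatica tool. It assumes a single-host URL in the standard HiveServer2 form jdbc:hive2://host:port/database, and the host names, port, and function names are hypothetical examples. It does not handle ZooKeeper-style URLs that list several hosts.

```python
import ipaddress
from urllib.parse import urlsplit


def hive_jdbc_host(url: str) -> str:
    """Return the host part of a single-host jdbc:hive2:// URL (illustrative helper)."""
    prefix = "jdbc:"
    if not url.startswith(prefix + "hive2://"):
        raise ValueError("expected a URL that starts with jdbc:hive2://")
    # Drop the "jdbc:" prefix so urlsplit sees an ordinary scheme://host:port/path URL.
    host = urlsplit(url[len(prefix):]).hostname
    if not host:
        raise ValueError("no host found in URL")
    return host


def uses_fqdn(url: str) -> bool:
    """Return True if the host looks like a fully qualified domain name.

    Assumption for this sketch: an FQDN has at least two dot-separated,
    non-empty labels and is not a literal IP address.
    """
    host = hive_jdbc_host(url)
    try:
        ipaddress.ip_address(host)
        return False  # a literal IP address is not an FQDN
    except ValueError:
        pass
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


if __name__ == "__main__":
    # Hypothetical example values; substitute the URL for your own Hive server.
    for candidate in (
        "jdbc:hive2://hive01.example.com:10000/default",
        "jdbc:hive2://hive01:10000/default",
        "jdbc:hive2://10.0.0.5:10000/default",
    ):
        print(candidate, "->", "FQDN" if uses_fqdn(candidate) else "not an FQDN")
```

The check is purely syntactic. It does not confirm that the name resolves or that the Hive server accepts connections, so treat it as a quick sanity test rather than a connection test.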

Tools to complete this step:

  • Informatica Catalog Administrator