Table of Contents

  1. About the Enterprise Data Preparation Administrator Guide
  2. Introduction to Enterprise Data Preparation Administration
  3. Getting Started
  4. Administration Process
  5. User Account Setup
  6. Search Configuration
  7. Roles, Privileges, and Profiles
  8. Data Asset Access and Publication Management
  9. Masking Sensitive Data
  10. Monitoring Enterprise Data Preparation
  11. Backing Up and Restoring Enterprise Data Preparation
  12. Managing the Data Lakehouse
  13. Schedule Export, Import and Publish Activities
  14. Interactive Data Preparation Service
  15. Enterprise Data Preparation Service

Enterprise Data Preparation Administrator Guide

Create Catalog Resources

Use Informatica Catalog Administrator to create Hive, HDFS, Microsoft Azure SQL Server, and Microsoft Azure SQL Data Warehouse resources in Enterprise Data Catalog.
A resource represents a data source from which scanners extract metadata for use in the data lake, or a data lake target to which users upload and publish data assets. Scanners attached to a resource extract metadata from the resource and store the metadata in Enterprise Data Catalog.
For more information about creating resources and scanners, see "Creating a Resource" in the Informatica Catalog Administrator Guide.
  1. Create a Hive resource that Enterprise Data Catalog uses to extract metadata from the Hive tables in the data lake, and that users access to upload or publish data to Hive. Configure the Hive resource with the following settings:
    • In the URL property on the General > Connection Properties panel, specify the Fully Qualified Domain Name (FQDN) of the Hive server in the JDBC connection URL, as shown in the example at the end of this step.
    • If you are using operating system profiles, the Hive user name that you specify as the value for the User property must be a Hive superuser. For more information about operating system profiles, see Using Operating System Profiles.
    • Import the relevant connectors to extract metadata from Hive sources.
    For more information about Hive scanner properties, see "Hive Resource Prerequisites and Connection Properties" in the Informatica Administrator Guide.
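    For example, a Hive JDBC connection URL that specifies the FQDN of the Hive server typically follows the standard HiveServer2 URL format. The host name, port, and database in this sketch are placeholders, not values from your environment:
      jdbc:hive2://hiveserver01.example.com:10000/default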
  2. Create an HDFS resource for each HDFS location that users access to import, upload, or publish data in the data lake.
    Select the Recursive Scan property for each HDFS resource that users can access to publish data to the data lake. If the property is not selected, an error occurs when a user publishes data.
    For more information about HDFS resource properties, see "HDFS Resource Connection Properties" in the Informatica Catalog Administrator Guide.
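    For example, an HDFS location is typically specified as a directory path or a full HDFS URI. The NameNode host, port, and directory in this sketch are placeholders for your environment:
      hdfs://namenode.example.com:8020/datalake/landing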
  3. Create a Microsoft Azure SQL Server or Data Warehouse resource for each Microsoft Azure SQL Server or Data Warehouse location. Configure the resource with the following settings:
    • In the General > General Properties panel, enter the general details and specify the resource type, either Microsoft Azure SQL Server or Microsoft Azure SQL Data Warehouse.
    • In the General > Connection Properties panel, specify the user name, password, host name, port number, database, and instance.
    • In the Metadata Load Settings panel, specify the schema and set the Import the system objects option.
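    As an illustration, the connection properties for a Microsoft Azure SQL resource typically include the fully qualified server host name and the default SQL Server port. The server name, database, and user name in this sketch are placeholder values, not settings from your environment:
      Host name: myserver.database.windows.net
      Port number: 1433
      Database: salesdb
      User name: catalog_scanner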
  4. Run a scan on the resources to load metadata into the catalog.
  5. Create schedules for the resources so that Enterprise Data Catalog regularly scans the resources. As a best practice, schedule the resource scans to run during non-business hours.

Tools to complete this step:

  • Informatica Catalog Administrator