
Big Data Management User Guide


Creating a Column Profile in Informatica Analyst


You can create a custom profile or a default profile. When you create a custom profile, you can configure the columns, sample rows, and drill-down options. When you create a default profile, the column profile and data domain discovery run on the entire data set with all the data domains.
  1. In the Discovery workspace, click Profile, or select New > Profile from the header area.
    You can also right-click the data object in the Library workspace and create a profile. In that case, the profile name, location name, and data object are taken from the data object properties. You can create a default profile or customize the settings to create a custom profile.
    The New Profile wizard appears.
  2. The Single source option is selected by default. Click Next.
  3. In the Specify General Properties screen, enter a name and an optional description for the profile. In the Location field, select the project or folder where you want to create the profile. Click Next.
  4. In the Select Source screen, click Choose to select a data object, or click New to import a data object. Click Next.
    • In the Choose Data Object dialog box, select a data object. Click OK.
      The Properties pane displays the properties of the selected data object. The Data Preview pane displays the columns in the data object.
    • In the New Data Object dialog box, you can choose a connection, schema, table, or view to create a profile on, select a location, and create a folder to import the data object. Click OK.
  5. In the Select Source screen, select the columns that you want to run a profile on. Optionally, select Name to select all the columns. Click Next.
    All the columns are selected by default. The Analyst tool lists the properties of each column, such as the name, data type, precision, scale, whether the column is nullable, and whether it participates in the primary key.
  6. In the Specify Settings screen, choose to run a column profile, data domain discovery, or a column profile and data domain discovery. By default, the column profile option is selected.
    • Choose Run column profile to run a column profile.
    • Choose Run data domain discovery to perform data domain discovery. In the Data domain pane, select the data domains that you want to discover, select the conformance criteria, and select the columns for data domain discovery in the Edit columns selection for data domain discovery dialog box.
    • Choose Run column profile and Run data domain discovery to run the column profile and data domain discovery. Select the data domain options in the Data domain pane.
      By default, the columns that you select are used for both the column profile and data domain discovery. Click Edit to select or deselect columns for data domain discovery.
    • Choose Data, Columns, or Data and Columns to run data domain discovery on.
    • Choose a sampling option. You can choose All rows (complete analysis), Sample first, Random sample, or Random sample (auto) as a sampling option in the Run profile on pane. This option applies to column profile and data domain discovery.
    • Choose a drilldown option. In the Drilldown pane, choose the Live or Staged drilldown option, or choose Off to disable drilldown. Optionally, click Select Columns to select columns to drill down on. You can choose to omit data type and data domain inference for columns with an approved data type or data domain.
    • Choose the Native, Hive (deprecated), or Hadoop option as the run-time environment. If you choose the Hive or Hadoop option, click Choose to select a Hadoop connection in the Select a Hadoop Connection dialog box.
  7. Click Next.
    The Specify Rules and Filters screen opens.
  8. In the Specify Rules and Filters screen, you can perform the following tasks:
    • Create, edit, or delete a rule. You can apply existing rules to the profile.
    • Create, edit, or delete a filter.
      When you create a scorecard on this profile, you can reuse the filters that you create for the profile.
  9. Click Save and Finish to create the profile, or click Save and Run to create and run the profile.
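
The choices that the wizard collects in the steps above can be summarized as a small data model. The following Python sketch is only an illustration for planning or reviewing profile settings before you enter them in the Analyst tool. It is not an Informatica API: the class name, field names, and the validate function are hypothetical, and the only rules it checks are the ones stated in the procedure (at least one of column profile or data domain discovery, a listed sampling option, a listed drilldown option, and a Hadoop connection when the run-time environment is Hive or Hadoop).

```python
from dataclasses import dataclass
from typing import List, Optional

# Option labels copied from the Specify Settings screen described above.
SAMPLING_OPTIONS = {
    "All rows (complete analysis)",
    "Sample first",
    "Random sample",
    "Random sample (auto)",
}
DRILLDOWN_OPTIONS = {"Live", "Staged", "Off"}
RUNTIME_ENVIRONMENTS = {"Native", "Hive (deprecated)", "Hadoop"}


@dataclass
class ProfileSettings:
    """Hypothetical model of the wizard choices; not part of any Informatica API."""
    name: str
    location: str
    data_object: str
    sampling: str
    drilldown: str
    runtime_environment: str
    description: str = ""                    # the description is optional
    run_column_profile: bool = True          # column profile is selected by default
    run_data_domain_discovery: bool = False
    hadoop_connection: Optional[str] = None  # needed only for Hive or Hadoop


def validate(settings: ProfileSettings) -> List[str]:
    """Return a list of problems; an empty list means the choices are consistent."""
    problems = []
    # Assumption: the name is mandatory, because only the description is called optional.
    if not settings.name:
        problems.append("name is required")
    if not (settings.run_column_profile or settings.run_data_domain_discovery):
        problems.append("select a column profile, data domain discovery, or both")
    if settings.sampling not in SAMPLING_OPTIONS:
        problems.append(f"unknown sampling option: {settings.sampling}")
    if settings.drilldown not in DRILLDOWN_OPTIONS:
        problems.append(f"unknown drilldown option: {settings.drilldown}")
    if settings.runtime_environment not in RUNTIME_ENVIRONMENTS:
        problems.append(f"unknown run-time environment: {settings.runtime_environment}")
    elif settings.runtime_environment != "Native" and not settings.hadoop_connection:
        problems.append("a Hadoop connection is required for this run-time environment")
    return problems
```

For example, a settings object that uses the Hadoop run-time environment without a Hadoop connection produces one problem, which corresponds to the step where you click Choose to select a Hadoop connection.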
