PowerExchange for HDFS User Guide


Creating a Complex File Data Object


Create a complex file data object to read data from or write data to HDFS or View File System (ViewFS).
  1. Select a project or folder in the Object Explorer view.
  2. Click File > New > Data Object.
  3. Select Complex File Data Object and click Next.
    The New Complex File Data Object dialog box appears.
  4. Optionally, enter a name for the data object.
  5. Click Browse next to the Location option and select the target project or folder.
  6. In the Resource Format list, select one of the following formats:
    • Intelligent Structure Model: to read any format that an intelligent structure parses.
    • Binary: to read any resource format.
    • Avro: to read an Avro resource.
    • Parquet: to read a Parquet resource.
    • JSON: to read a JSON resource.
    • Orc: to read an Orc resource.
    • XML: to read an XML resource.
    The Intelligent Structure Model format is supported only in Spark mode.
  7. In the Access Type list, select Connection or File.
    • Select Connection to access a file on HDFS or ViewFS. Click Browse next to the Connection option and select an HDFS or ViewFS connection. Click Add next to the Selected Resource option to add a resource to the data object.
      If a default Metadata Access Service is not set, a message prompts you to configure the Metadata Access Service. Click OK and set one Metadata Access Service as the default. After you set a default Metadata Access Service, the Add Resource dialog box appears. If no Metadata Access Service exists, contact the Informatica administrator to create one in the domain.
      Navigate to or search for the resources to add to the data object and click OK.
    • Select File to access a file on your local system. Click Browse next to the Resource Location option and select the file that you want to add. Click Fetch. The selected file is added to the Selected Resources list.
    To use an intelligent structure model, for the Selected Resource option, browse to and select the appropriate .amodel file.
  8. From the Available OS Profiles list, select an operating system profile. Use an operating system profile to increase security and to isolate the design-time user environment when you import and preview metadata from a Hadoop cluster.
    The Developer tool displays the Available OS Profiles list only if the Metadata Access Service is enabled to use operating system profiles. By default, the Metadata Access Service imports the metadata with the default operating system profile assigned to the user. You can select a different profile from the list of available operating system profiles.
  9. Click Finish.
    The data object appears under the Physical Data Objects category in the project or folder in the Object Explorer view. A read operation and a write operation are created for the data object. Depending on whether you want to use the complex file data object as a source or a target, you can edit the read or write operation properties. You can also create multiple read and write operations for a complex file data object.
    For a data object with an intelligent structure model, create a read operation. You cannot use a write transformation for a data object with an intelligent structure model in a mapping.
    For the Parquet resource type, the complex file data object write operation succeeds and the mapping runs successfully even if ports for required fields are unconnected. When such a mapping runs, NULL values are inserted in the target object. However, the complex file data object read operation fails when it reads NULL values from the Parquet resource, because the Parquet Example Object Model does not support reading NULL values.
  10. For a read operation with an intelligent structure model, specify the path to the input file. In the Data Object Operations panel, select the Advanced tab, and then enter the path in the File path field.
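
The constraints stated in the procedure above can be collected into a small pre-flight check. The following Python sketch is illustrative only and is not part of the Developer tool or of any Informatica API. The DataObjectPlan class, its field names, the run_mode labels, and both helper functions are hypothetical names introduced for this example. The sketch assumes you record your wizard choices by hand and want to catch the documented restrictions (Spark-only intelligent structure models, read-only use, the .amodel file, the step 10 file path, and NULL values in required Parquet fields) before you build the mapping.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Sequence

# Resource formats listed in step 6 of the procedure.
ISM = "Intelligent Structure Model"
RESOURCE_FORMATS = {ISM, "Binary", "Avro", "Parquet", "JSON", "Orc", "XML"}


@dataclass
class DataObjectPlan:
    """Hypothetical record of the choices made in the wizard."""
    resource_format: str
    access_type: str            # "Connection" or "File" (step 7)
    selected_resource: str      # resource path, or the .amodel file for ISM
    run_mode: str = "Spark"     # hypothetical label for the run-time mode
    operation: str = "read"     # "read" or "write"
    input_file_path: str = ""   # File path field from step 10 (ISM reads)


def check_plan(plan: DataObjectPlan) -> List[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if plan.resource_format not in RESOURCE_FORMATS:
        problems.append("unknown resource format: " + plan.resource_format)
    if plan.access_type not in ("Connection", "File"):
        problems.append("access type must be Connection or File")
    if plan.resource_format == ISM:
        if plan.run_mode != "Spark":
            problems.append("intelligent structure model requires Spark mode")
        if plan.operation != "read":
            problems.append("intelligent structure model supports read only")
        if not plan.selected_resource.endswith(".amodel"):
            problems.append("selected resource must be a .amodel file")
        if plan.operation == "read" and not plan.input_file_path:
            problems.append("specify the input file path (step 10)")
    return problems


def rows_with_null_required(
    rows: Iterable[Mapping[str, object]],
    required_fields: Sequence[str],
) -> List[int]:
    """Return indexes of rows where a required field is missing or None.

    Such rows would be written to Parquet as NULL values, which a later
    read operation cannot process according to the note in step 9.
    """
    return [
        index
        for index, row in enumerate(rows)
        if any(row.get(field) is None for field in required_fields)
    ]
```

For example, check_plan(DataObjectPlan("Parquet", "Connection", "/data/events.parquet", operation="write")) returns an empty list, while a plan that pairs an intelligent structure model with a write operation returns one message per violated rule. Running rows_with_null_required on sample rows before a Parquet write shows which records would leave a required field NULL.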
