Microsoft Azure Data Lake Storage Gen2 Connector

Microsoft Azure Data Lake Storage Gen2 targets in mappings

In a mapping, you can use a Microsoft Azure Data Lake Storage Gen2 object as a target.
When you use Microsoft Azure Data Lake Storage Gen2 target objects, you can select a Microsoft Azure Data Lake Storage Gen2 collection as the target. You can configure Microsoft Azure Data Lake Storage Gen2 target properties on the Target page of the Mapping wizard. When you write data to Microsoft Azure Data Lake Storage Gen2, you can use the Create Target field to create a target at run time. When you create a new target based on the source, you must remove all binary fields from the field mapping.
The following table describes the Microsoft Azure Data Lake Storage Gen2 target properties that you can configure in a Target transformation:
Property
Description
Connection
Name of the target connection. Select a target connection or click New Parameter to define a new parameter for the target connection.
If you want to overwrite the parameter at run time, select the Allow parameter to be overridden at run time option when you create the parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties.
When you switch between a non-parameterized and a parameterized Microsoft Azure Data Lake Storage Gen2 connection, the advanced property values are retained.
Target Type
Select Single Object or Parameter.
Object
Name of the target object. You can select an existing object or create a new target at runtime.
When you select Create New at Runtime, enter a name for the target object and select the source fields that you want to use. By default, all source fields are used.
The target name can contain alphanumeric characters. The only special characters that you can use in the file name are a period (.), an underscore (_), an at sign (@), a dollar sign ($), and a percent sign (%).
Ensure that the headers and the file data do not contain special characters.
You can use parameters defined in a parameter file in the target name. When you select the Create Target option, you cannot parameterize the target at runtime.
When you write data to a flat file created at runtime, the target flat file contains a blank line at the end of the file.
Parameter
Select an existing parameter for the target object or click New Parameter to define a new parameter for the target object.
The Parameter property appears only if you select Parameter as the target type.
When you parameterize the target object, specify the complete object path including the file system in the default value of the parameter.
If you want to overwrite the parameter at run time, select the Allow parameter to be overridden at run time option when you create the parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties. Ensure that the parameter file is in the correct format.
Format
Specifies the file format that the Microsoft Azure Data Lake Storage Gen2 Connector uses to write data to Microsoft Azure Data Lake Storage Gen2.
You can select the following file format types:
  • Flat
  • Avro
  • Parquet
  • JSON
  • ORC
Default is None.
If you select None as the format type, Microsoft Azure Data Lake Storage Gen2 Connector writes data to Microsoft Azure Data Lake Storage Gen2 files in binary format.
For more information, see File formatting options.
Operation
The target operation. Select Insert to insert data into a Microsoft Azure Data Lake Storage Gen2 target.
When you use the Create Target option and specify an object name with an extension that does not match the Format Type under Formatting Options, the Secure Agent ignores the format type that you specified under Formatting Options.
For example, if you select the Parquet format type and specify customer.avro as the object name in the Target Object dialog box, the Secure Agent ignores Parquet and creates an Avro target file.
The following table describes the advanced target properties for Microsoft Azure Data Lake Storage Gen2:
Advanced Target Property
Description
Concurrent Threads¹
Number of concurrent connections to write data to Microsoft Azure Data Lake Storage Gen2. When you write a large file, you can spawn multiple threads to process data. Configure Block Size to divide a large file into smaller parts.
Default is 4. Maximum is 10.
Filesystem Name Override
Overrides the default file system name.
Directory Override
Microsoft Azure Data Lake Storage Gen2 directory that you use to write data. Default is the root directory. The Secure Agent creates the directory if it does not exist. The directory path specified at run time overrides the path specified while creating a connection.
You can specify an absolute or a relative directory path:
  • Absolute path - The Secure Agent searches this directory path in the specified file system.
    Example of absolute path:
    Dir1/Dir2
  • Relative path - The Secure Agent searches this directory path in the native directory path of the object.
    Example of relative path:
    /Dir1/Dir2
    When you use the relative path, the imported object path is added to the file path used during the metadata fetch at runtime.
Do not specify a root directory (/) to override the directory.
File Name Override
Target object. Select the file to which you want to write data. The file specified at run time overrides the file specified in Object.
Write Strategy
Applicable to complex and flat files.
When you create a mapping, you can use either the overwrite or the append write strategy for flat files. However, you can use only the overwrite strategy for complex files.
When you create a mapping in advanced mode, you can use either the overwrite or the append write strategy for both flat files and complex files.
When you create a new target at runtime and use the append strategy, the mapping creates a new target file and writes the data to the file. The mapping appends data in subsequent runs.
When you append data for mappings in advanced mode, the data is appended as a new part file in the existing target directory.
The maximum size of data that you can append is 450 MB.
Default is overwrite.
Block Size¹
Applicable to flat, Avro, and Parquet file formats. Divides a large file into smaller parts of the specified block size. When you write a large file, divide the file into smaller parts and configure concurrent connections to spawn the required number of threads to process data in parallel.
Specify an integer value for the block size.
Default is 8388608 bytes.
Compression Format
Compresses and writes data to the target based on the format you specify.
Select one of the following options:
  • None. Select to write Avro, ORC, and Parquet files that use Snappy compression.
    You cannot write compressed JSON files.
  • Gzip. Select to write flat files and Parquet files that use Gzip compression.
When the task runs, the file extensions .gz or .snappy do not appear in the target object name.
Timeout Interval
Not applicable.
Interim Directory¹
Optional. Applicable to flat files and JSON files.
Path to the staging directory on the Secure Agent machine.
Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Data Lake Storage Gen2. Ensure that the directory has sufficient space and that you have write permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory for mappings in advanced mode.
You cannot specify an interim directory when you use the Hosted Agent.
Forward Rejected Rows¹
Configure the transformation to either pass rejected rows to the next transformation or drop them.
¹ Doesn't apply to mappings in advanced mode.
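The relationship between Block Size and Concurrent Threads can be made concrete with a little arithmetic. The Python sketch below is an illustration under stated assumptions, not the connector's implementation: the function name plan_upload is hypothetical, and the idea that the number of busy threads is the smaller of the block count and the configured thread count is an assumption about how the parts are distributed. The default and maximum values are the ones listed in the table above.

```python
DEFAULT_BLOCK_SIZE = 8_388_608  # bytes, documented default
DEFAULT_THREADS = 4             # documented default
MAX_THREADS = 10                # documented maximum


def plan_upload(file_size: int,
                block_size: int = DEFAULT_BLOCK_SIZE,
                concurrent_threads: int = DEFAULT_THREADS) -> tuple[int, int]:
    """Return (number_of_blocks, threads_with_work) for one file.

    Assumption of this sketch: each thread handles one block at a
    time, so at most `number_of_blocks` threads can be busy.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if block_size <= 0:
        raise ValueError("block_size must be a positive integer")
    if not 1 <= concurrent_threads <= MAX_THREADS:
        raise ValueError(f"concurrent_threads must be 1..{MAX_THREADS}")

    # Integer ceiling division: the last block may be partially filled.
    blocks = -(-file_size // block_size)
    return blocks, min(blocks, concurrent_threads)
```

For example, under these assumptions a 104857600-byte file written with the default settings splits into 13 blocks, of which 4 can be processed at the same time. Raising Concurrent Threads only helps when the file yields more blocks than threads.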
