Microsoft Azure Data Lake Storage Gen2 Connector

Microsoft Azure Data Lake Storage Gen2 targets in mappings

In a mapping, you can use a Microsoft Azure Data Lake Storage Gen2 object as a target.

When you use Microsoft Azure Data Lake Storage Gen2 target objects, you can select a Microsoft Azure Data Lake Storage Gen2 collection as the target. You can configure Microsoft Azure Data Lake Storage Gen2 target properties on the Target page of the Mapping wizard. When you write data to Microsoft Azure Data Lake Storage Gen2, you can use the create target field to create a target at run time. When you create a new target based on the source, you must remove all the binary fields from the field mapping.

The following table describes the Microsoft Azure Data Lake Storage Gen2 target properties that you can configure in a Target transformation:

Property	Description
Connection	Name of the target connection. Select a target connection or click New Parameter to define a new parameter for the target connection. If you want to overwrite the parameter at runtime, select the Allow parameter to be overridden at run time option when you create a parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties. When you switch between a non-parameterized and a parameterized Microsoft Azure Data Lake Storage Gen2 connection, the advanced property values are retained.
Target Type	Select Single Object or Parameter.
Object	Name of the target object. You can select an existing object or create a new target at runtime. When you select Create New at Runtime, enter a name for the target object and select the source fields that you want to use. By default, all source fields are used. The target name can contain alphanumeric characters. The only special characters that you can use in the file name are a period (.), an underscore (_), an at sign (@), a dollar sign ($), and a percent sign (%). Ensure that the headers and the file data do not contain special characters. You can use parameters defined in a parameter file in the target name. When you select the Create Target option, you cannot parameterize the target at runtime. When you write data to a flat file created at runtime, the target flat file contains a blank line at the end of the file.
Parameter	Select an existing parameter for the target object or click New Parameter to define a new parameter for the target object. The Parameter property appears only if you select Parameter as the target type. When you parameterize the target object, specify the complete object path including the file system in the default value of the parameter. If you want to overwrite the parameter at runtime, select the Allow parameter to be overridden at run time option when you create a parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties. Ensure that the parameter file is in the correct format.
Format	Specifies the file format that the Microsoft Azure Data Lake Storage Gen2 Connector uses to write data to Microsoft Azure Data Lake Storage Gen2. You can select the following file format types: Flat, Avro, Parquet, JSON, and ORC. Default is None. If you select None as the format type, Microsoft Azure Data Lake Storage Gen2 Connector writes data to Microsoft Azure Data Lake Storage Gen2 files in binary format. For more information, see File formatting options.
Operation	The target operation. Select Insert to insert data into a Microsoft Azure Data Lake Storage Gen2 target.
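
The naming rule for the Object property can be checked before a task runs. The following is a minimal sketch, not part of the connector: the function name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits. The special
# characters are the ones listed for the Object property: . _ @ $ %
_ALLOWED_TARGET_NAME = re.compile(r"[A-Za-z0-9._@$%]+")


def is_valid_target_name(name: str) -> bool:
    """Return True if name is non-empty and uses only the listed characters.

    Hypothetical helper for illustration; the connector does not expose it.
    """
    return _ALLOWED_TARGET_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("customer_2024.csv", "customer data.csv", "sales-eu.csv"):
        print(candidate, is_valid_target_name(candidate))
```

A name with a space or a hyphen fails the check because neither character is in the listed set.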

When you use the Create Target option and specify an object name with an extension that does not match the Format Type under Formatting Options, the Secure Agent ignores the format type that you specified under Formatting Options.

For example, if you select the Parquet format type and specify customer.avro as the object name in the Target Object dialog box, the Secure Agent ignores Parquet and creates an Avro target file.
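
The precedence described above, where the extension in the object name wins over the selected format type, can be sketched as follows. This is an illustration only: the extension-to-format table and the function name are assumptions, and falling back to the selected format when the extension is missing or unrecognized is also an assumption, because the documentation does not describe that case.

```python
import os

# Hypothetical extension table for illustration. The documentation only
# confirms the .avro case; the other entries are assumptions.
_EXTENSION_FORMATS = {
    ".avro": "Avro",
    ".parquet": "Parquet",
    ".json": "JSON",
    ".orc": "ORC",
}


def effective_format(object_name: str, selected_format: str) -> str:
    """Return the format the target file would be written in.

    The extension in object_name takes precedence over selected_format.
    Assumption: with no recognized extension, selected_format is used.
    """
    extension = os.path.splitext(object_name)[1].lower()
    return _EXTENSION_FORMATS.get(extension, selected_format)


if __name__ == "__main__":
    # The documented example: Parquet selected, customer.avro as the name.
    print(effective_format("customer.avro", "Parquet"))
```

To avoid the mismatch, keep the object name extension and the format type consistent.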

The following table describes the advanced target properties for Microsoft Azure Data Lake Storage Gen2:

Advanced Target Property	Description
Concurrent Threads¹	Number of concurrent connections to load data from Microsoft Azure Data Lake Storage Gen2. When you write a large file, you can spawn multiple threads to process data. Configure Block Size to divide a large file into smaller parts. Default is 4. Maximum is 10.
Filesystem Name Override	Overrides the default file system name.
Directory Override	Microsoft Azure Data Lake Storage Gen2 directory that you use to write data. Default is root directory. The Secure Agent creates the directory if it does not exist. The directory path specified at run time overrides the path specified while creating a connection. You can specify an absolute or a relative directory path. Absolute path: The Secure Agent searches this directory path in the specified file system. Example of absolute path: Dir1/Dir2. Relative path: The Secure Agent searches this directory path in the native directory path of the object. Example of relative path: /Dir1/Dir2. When you use the relative path, the imported object path is added to the file path used during the metadata fetch at runtime. Do not specify a root directory (/) to override the directory.
File Name Override	Target object. Select the file to which you want to write data. The file specified at run time overrides the file specified in Object.
Write Strategy	Applicable to complex and flat files. When you create a mapping, you can use the overwrite and append write strategy for flat files. However, you can use only the overwrite strategy for complex files. When you create a mapping in advanced mode, you can use the overwrite and append write strategy for both flat files and complex files. When you create a new target at runtime and use the append strategy, the mapping creates a new target file and writes the data to the file. The mapping appends data in subsequent runs. When you append data for mappings in advanced mode, the data is appended as a new part file in the existing target directory. The maximum size of data that you can append is 450 MB. Default is overwrite.
Block Size¹	Applicable to flat, Avro, and Parquet file formats. Divides a large file into smaller blocks of the specified size. When you write a large file, divide the file into smaller parts and configure concurrent connections to spawn the required number of threads to process data in parallel. Specify an integer value for the block size. Default value in bytes is 8388608.
Compression Format	Compresses and writes data to the target based on the format you specify. Select one of the following options: None. Select to write Avro, ORC, and Parquet files that use Snappy compression. You cannot write compressed JSON files. Gzip. Select to write flat files and Parquet files that use Gzip compression. When the task runs, the file extensions .gz or .snappy do not appear in the target object name.
Timeout Interval	Not applicable.
Interim Directory¹	Optional. Applicable to flat files and JSON files. Path to the staging directory on the Secure Agent machine. Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Data Lake Storage Gen2. Ensure that the directory has sufficient space and that you have write permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory for mappings in advanced mode. You cannot specify an interim directory when you use the Hosted Agent.
Forward Rejected Rows¹	Configure the transformation to either pass rejected rows to the next transformation or drop them.
¹Doesn't apply to mappings in advanced mode.
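
To see how Block Size and Concurrent Threads relate, the following sketch estimates how many blocks a file splits into and how many threads can be busy at once. It is arithmetic under stated assumptions, not the connector's scheduling logic: the function name is hypothetical, and the minimum of one thread is an assumption. The defaults and the maximum of 10 come from the table above.

```python
DEFAULT_BLOCK_SIZE = 8_388_608  # bytes, documented default (8 * 1024 * 1024)
DEFAULT_THREADS = 4             # documented default for Concurrent Threads
MAX_THREADS = 10                # documented maximum for Concurrent Threads


def plan_blocks(file_size: int,
                block_size: int = DEFAULT_BLOCK_SIZE,
                threads: int = DEFAULT_THREADS) -> tuple[int, int]:
    """Return (number of blocks, number of threads that have work).

    Hypothetical helper for illustration only.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Assumption: at least one thread is required.
    if not 1 <= threads <= MAX_THREADS:
        raise ValueError(f"threads must be between 1 and {MAX_THREADS}")
    # Ceiling division with integers avoids floating-point rounding.
    blocks = (file_size + block_size - 1) // block_size
    return blocks, min(blocks, threads)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    print(plan_blocks(one_gib))  # (128, 4): 128 blocks, 4 threads busy
```

With the defaults, a 1 GiB file splits into 128 blocks of 8388608 bytes, so raising Concurrent Threads up to the maximum of 10 still leaves every thread with work.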