Connecting to an HDFS Cluster from Informatica Vibe Data Stream for Machine Data 2.1.0


Adding the Source Service and Target Service to the Data Flow

After you create the data flow, add the source service and target service to the data flow.

In the Data Flows pane, click the data flow to which you want to add a source.

From the Entity Types pane, drag the File source service to the Data Flow Designer pane.

The New Source dialog box appears.

Specify the properties of the source service and click

From the Entity Types pane, drag the HDFS target service to the Data Flow Designer pane.

The New Target dialog box appears.

Configure the following properties for the HDFS target service type:

Property	Description
Entity Name	Name of the HDFS target service. Maximum length is 32 characters.
Destination	URI of the target file to which to write data. The HDFS target service type supports the HDFS URI format hdfs://<namenode-name>[:<port>]/<path>/<file-name>, where namenode-name is the host name or IP address of the HDFS NameNode; port is the port number on which the HDFS NameNode listens for connections (you can omit the port number if you have configured HDFS to listen for connections on the default port, 8020); and path and file-name represent the location of the target file in the target file system. This URI format is suitable for a standalone HDFS target service and for an HDFS target service that runs on a node that is not part of a high-availability setup. To use multiple target service instances for load balancing or high availability, use variables in the URI. The destination URI must be the same URI that you used to verify the connection to the HDFS cluster.
Rollover Count	Limit for the number of files that can exist on the target at any particular time. Default is 1024.
Rollover Size	Target file size, in gigabytes (GB), at which to trigger rollover. Default is 1. A value of zero (0) means that the HDFS target service does not perform rollover based on size.
Rollover Time	Length of time, in hours, to keep a target file active. After the time period has elapsed, the target service rolls the file over. Default is 0. A value of zero (0) means that the HDFS target service does not perform rollover based on time.
Force Synchronization	Flushes the client's buffer to the disk device every second. If you enable forceful synchronization, data written by the target service is immediately visible to other readers. Forceful synchronization degrades the performance of VDS. For more information about forceful synchronization, see the Hadoop documentation. By default, the target service does not synchronize forcefully.
UM XML Configuration	Specify the UM configurations that the target service uses. Maximum length is 1000 characters.
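
Before you enter the values in the New Target dialog box, it can help to check them against the limits in the table. The following Python sketch is not part of VDS. It only illustrates the documented Destination URI format, the default NameNode port of 8020, and the documented maximum lengths. The function names parse_hdfs_destination and check_length_limits are hypothetical, and the sample host names and paths are placeholders.

```python
from urllib.parse import urlsplit

# Values taken from the property table above.
DEFAULT_NAMENODE_PORT = 8020
MAX_ENTITY_NAME_LENGTH = 32
MAX_UM_XML_LENGTH = 1000


def parse_hdfs_destination(uri):
    """Split hdfs://<namenode-name>[:<port>]/<path>/<file-name> into parts.

    Hypothetical helper: returns (namenode, port, path) and falls back to
    the default port when the URI omits it. urlsplit lowercases the host.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {uri!r}")
    if not parts.hostname:
        raise ValueError("the URI must name the NameNode host or IP address")
    if not parts.path or parts.path.endswith("/"):
        raise ValueError("the URI must end with a target file name")
    port = parts.port if parts.port is not None else DEFAULT_NAMENODE_PORT
    return parts.hostname, port, parts.path


def check_length_limits(entity_name, um_xml=""):
    """Return a list of problems with the documented length limits."""
    problems = []
    if len(entity_name) > MAX_ENTITY_NAME_LENGTH:
        problems.append("Entity Name is longer than 32 characters")
    if len(um_xml) > MAX_UM_XML_LENGTH:
        problems.append("UM XML Configuration is longer than 1000 characters")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(parse_hdfs_destination("hdfs://namenode01/logs/app.log"))
    print(check_length_limits("hdfs_target_1"))
```

The sketch checks only the shape of the URI. It does not contact the cluster, and it does not handle the variables that you use in the URI for load balancing or high availability.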
