
User Guide

2.4.0
- 2.5.0


HDFS Target Service

Use a Hadoop Distributed File System (HDFS) target service to write data to HDFS. To create an HDFS target service, use the HDFS target service type. You can configure the target service for target file rollover. You can also perform advanced configurations to avoid data loss in high availability and load balancing deployments.

If HDFS is Kerberos enabled, create the hdfs super user principal. Ensure that the Hadoop users have a Kerberos principal or keytab to get the Kerberos credentials that are required to access the cluster and use the Hadoop services.
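As a rough illustration of the Kerberos requirement above, the following Python sketch checks whether the current user already holds usable Kerberos credentials. It assumes that the MIT Kerberos klist client is installed on the host and relies on its -s option, which exits with status 0 only when the credentials cache is readable and unexpired. The function name has_valid_kerberos_ticket is hypothetical and is not part of EDS.

```python
import subprocess


def has_valid_kerberos_ticket(klist_cmd: str = "klist") -> bool:
    """Return True if `klist -s` finds a readable, unexpired credentials cache.

    Assumption: klist_cmd is the MIT Kerberos klist client. Other klist
    implementations may not support the -s option.
    """
    try:
        result = subprocess.run(
            [klist_cmd, "-s"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # klist is not installed or cannot be executed on this host.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_kerberos_ticket():
        print("Kerberos credentials are available.")
    else:
        print("No valid Kerberos credentials found. Obtain them with kinit first.")
```

A check of this kind only confirms that credentials exist for the user who runs it. It does not confirm that the principal is authorized to write to the HDFS cluster.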

Before you deploy a data flow that uses HDFS target services, perform the following tasks:

- Install the HDFS distribution that you want and set an environment variable that EDS can use to find the client libraries for the HDFS distribution. Verify that you have the client libraries installed in the path where the EDS Node is running. For example, if you have a Cloudera distribution, download the libraries from the Cloudera Downloads page.
- Set an environment variable on each host on which an HDFS target service runs. The environment variable must point to the Hadoop base directory. The environment variable is of the form HADOOPBASEDIR=<Hadoop_Home_Directory>. For example, HADOOPBASEDIR=/usr/hadoop-2.0.2-alpha
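The environment variable requirement above can be verified on each host before deployment. The following is a minimal Python sketch of such a check, not an EDS utility. The function name check_hadoop_base_dir is hypothetical, and the sketch only confirms that HADOOPBASEDIR is set and points to an existing directory. It does not inspect the client libraries inside that directory.

```python
import os
from typing import Mapping, Optional


def check_hadoop_base_dir(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the directory named by HADOOPBASEDIR, or raise RuntimeError.

    env defaults to the environment of the current process. Passing a
    mapping makes the check easy to exercise without changing os.environ.
    """
    env = os.environ if env is None else env
    base_dir = env.get("HADOOPBASEDIR", "").strip()
    if not base_dir:
        raise RuntimeError("HADOOPBASEDIR is not set on this host")
    if not os.path.isdir(base_dir):
        raise RuntimeError(
            f"HADOOPBASEDIR points to a missing directory: {base_dir}"
        )
    return base_dir


if __name__ == "__main__":
    try:
        print("Hadoop base directory:", check_hadoop_base_dir())
    except RuntimeError as err:
        print("Pre-deployment check failed:", err)
```

Run the check as the same user and in the same environment that starts the EDS Node, because a variable set in an interactive shell is not necessarily visible to the node process.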

The Retry on Failure and Number of Retries properties are not applicable for the HDFS target service.

For more information about product requirements and supported platforms, see the Product Availability Matrix on Informatica Network: https://network.informatica.com/community/informatica-network/product-availability-matrices

Built-in Target Service Types

Target File Rollover

HDFS Target Service Properties

Configuring the HDFS Target Service Type to Work With a High Availability HDFS

Advanced Configuration for Entities
