Table of Contents

  1. Preface
  2. Introduction to Informatica Edge Data Streaming
  3. Licenses
  4. Using Informatica Administrator
  5. Creating and Managing the Edge Data Streaming Service
  6. Edge Data Streaming Entity Types
  7. Edge Data Streaming Nodes
  8. Data Connections
  9. Working With Data Flows
  10. Managing the Edge Data Streaming Components
  11. Security
  12. High Availability
  13. Disaster Recovery
  14. Monitoring Edge Data Streaming Entities
  15. Troubleshooting
  16. Frequently Asked Questions
  17. Regular Expressions
  18. Command Line Program
  19. Configuring Edge Data Streaming to Work With a ZooKeeper Observer
  20. Glossary

User Guide

Edge Data Streaming Data Flow Process

You use the Administrator tool to design the flow of data from the data source to the data target and to deploy the data flow.
The Administrator Daemon pushes the data flow configuration information to Apache ZooKeeper. The EDS Nodes download the configuration information and start the source services and target services that the configuration specifies. Source services read data in blocks and publish messages through a data connection. Target services receive the data and write it to a data target. Each EDS Node monitors the entities in the data flow and sends state and statistics information to the Administrator Daemon, which forwards the information to the Administrator tool.
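The read-in-blocks, publish, and subscribe behavior described above can be sketched as a minimal in-memory pub/sub loop. This is purely illustrative: the names `Topic`, `read_blocks`, and the 4 KB block size are assumptions for the sketch, not the actual EDS API.

```python
# Illustrative sketch only: Topic, read_blocks, and the block size are
# hypothetical names, not part of the EDS product API.

class Topic:
    """In-memory stand-in for a data connection topic."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # Deliver the message to every subscribed target service.
        for callback in self.subscribers:
            callback(message)

def read_blocks(path, block_size=4096):
    """Source services read data in blocks; yield one block per message."""
    with open(path, "rb") as f:
        while block := f.read(block_size):
            yield block

def run_source_service(topic, path):
    # The source service publishes each block as a message on the topic.
    for block in read_blocks(path):
        topic.publish(block)

def run_target_service(topic, sink):
    # The target service subscribes and writes received data to its target.
    topic.subscribe(sink.append)
```

In the real product, the data connection (Ultra Messaging or WebSockets) carries the messages between hosts; the sketch collapses that transport into a direct callback.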
For example, an application writes log data to log files in the directory /usr/app/logs/, and you want to transfer the data in those log files to an HDFS cluster. To transfer the data, install an EDS Node on the application host machine and another on the target host machine. As part of the post-installation tasks, start an EDS Node, Node1, on the application host and an EDS Node, Node2, on the target host.
The following image shows how EDS works:
The image numbers the operations in the order of occurrence. The following steps describe the sequence of operations:
  1. Use the Administrator tool to create and deploy a data flow. When you configure the data connection in the data flow, use an Ultra Messaging or a WebSockets data connection. In the data flow, create a source service. Specify the source directory as /usr/app/logs/, and map the service to Node1. Create an HDFS target service and map the target service to Node2. Connect the source service to the target service, and add any transformations that you want to apply to the data. Finally, deploy the data flow. The Administrator Daemon sends the data flow configuration information to ZooKeeper.
  2. The EDS Nodes download data flow configuration information from ZooKeeper. The EDS Node Node1 starts a source service. Similarly, Node2 starts a target service.
  3. The source service reads data from the source files and publishes that data as messages on a topic. EDS applies the transformations that you added to the data flow. The target service subscribes to the topic, receives the data, and writes it to the HDFS cluster.
  4. Each EDS Node sends information about state and statistics to the Administrator Daemon. The Administrator Daemon publishes the information through the Edge Data Streaming Service. You can view the information on the Monitoring tab in the Administrator tool.
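Steps 1 and 2 above hinge on one idea: the deployed configuration maps each service to a node, and each EDS Node starts only the services mapped to it. A minimal sketch of that mapping follows; the dictionary keys and the `services_for_node` helper are illustrative assumptions, not the actual configuration format that the Administrator Daemon writes to ZooKeeper.

```python
# Illustrative sketch: the configuration layout and helper name are
# hypothetical, not the real EDS/ZooKeeper configuration format.

DATA_FLOW_CONFIG = {
    "source_service": {"node": "Node1", "directory": "/usr/app/logs/"},
    "target_service": {"node": "Node2", "type": "HDFS"},
    "connection": "WebSockets",  # or "Ultra Messaging"
}

def services_for_node(node_name, config):
    """Return the services a given EDS Node should start (step 2)."""
    return [name for name, spec in config.items()
            if isinstance(spec, dict) and spec.get("node") == node_name]
```

With this layout, Node1 would start only the source service and Node2 only the HDFS target service, matching the division of work in the example.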

Updated March 19, 2019
