With the advent of big data technologies, organizations want to benefit from the velocity of data by capturing it as it becomes available, processing it, and responding to events in real time. Real-time streaming lowers latency, which helps organizations maintain a complete, up-to-date view of customers, deliver real-time operational intelligence to customers, improve fraud detection, reduce security risk, improve physical asset management, improve the total customer experience, and make faster, better-informed decisions.
In version 10.1.1, Informatica introduces Intelligent Streaming, a new product that helps IT derive maximum value from real-time queues by streaming data, processing it, and extracting meaningful business value in near real time. Customers can process diverse data types from non-traditional sources, such as website log file data, sensor data, message bus data, and machine data, in flight and with a high degree of accuracy.
Intelligent Streaming is built as a capability extension of Informatica's Intelligent Data Platform and provides the following benefits for IT:
- Create and run streaming (continuous-processing) mappings.
- Collect events from real-time queues such as Apache Kafka and JMS.
- Transform the data, create business rules for the transformed data, detect real-time patterns, and drive automated responses or alerts.
- Manage and monitor streams at run time.
- Provide at-least-once delivery guarantees.
- Apply granular lifecycle controls based on the number of rows processed or the time of execution.
- Reuse and maintain event processing logic, including batch mappings (after some modifications).
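To illustrate what an at-least-once delivery guarantee implies for downstream systems, the following is a minimal Python sketch. It is not Informatica's implementation. The names `consume_at_least_once` and `FlakySink` are hypothetical, and the in-memory list stands in for a real queue such as Kafka or JMS. The sketch advances the committed offset only after the handler succeeds, so a failure between the write and the acknowledgement causes the same event to be delivered again.

```python
from typing import Callable, Iterable, List


def consume_at_least_once(
    events: List[str],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> int:
    """Process events in order and return the final committed offset.

    The offset advances only after the handler returns without error,
    so a failed event is redelivered rather than skipped.
    """
    committed = 0  # index of the next event that has not been acknowledged
    attempts = 0   # failed attempts for the current event
    while committed < len(events):
        try:
            handler(events[committed])
        except RuntimeError:
            attempts += 1
            if attempts >= max_attempts:
                raise  # give up after repeated failures
            continue   # redeliver the same event
        committed += 1  # acknowledge only after success
        attempts = 0
    return committed


class FlakySink:
    """Hypothetical target that writes an event, then fails once before acknowledging."""

    def __init__(self, fail_on: Iterable[str]) -> None:
        self.delivered: List[str] = []
        self._fail_on = set(fail_on)

    def __call__(self, event: str) -> None:
        self.delivered.append(event)  # the write happens first
        if event in self._fail_on:
            self._fail_on.discard(event)
            raise RuntimeError("failure before acknowledgement")


sink = FlakySink(fail_on=["b"])
final_offset = consume_at_least_once(["a", "b", "c"], sink)
print(final_offset)    # 3
print(sink.delivered)  # ['a', 'b', 'b', 'c']
```

No event is lost, but event `b` reaches the target twice. Under an at-least-once guarantee, targets generally need to tolerate or deduplicate such repeats.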
Intelligent Streaming has the following features:
- Capture and Transport Stream Data
You can stream the following types of data from sources such as Kafka or JMS, in JSON, XML, or Avro formats:
- Application and infrastructure log data
- Change data capture (CDC) from relational databases
- Clickstreams from web servers
- Social media event streams
- Time-series data from IoT devices
- Message bus data
- Programmable logic controller (PLC) data
- Point of sale data from devices
In addition, Informatica customers can use Informatica's Vibe Data Stream (licensed separately) to collect data in real time, for example data from sensors and machine logs, and ingest it into a Kafka queue. Intelligent Streaming can then process this data.
- Refine, Enrich, Analyze, and Process Stream Data
- Use the underlying processing platform to run the following complex data transformations in real time without coding or scripting:
- Window transformation for streaming use cases, with a choice of sliding or tumbling windows.
- Filter, Expression, Union, Router, Aggregate, Joiner, Lookup, Java, and Sorter transformations can now be used in streaming mappings and run on Spark Streaming.
- Lookup transformations can be used with flat file, HDFS, Sqoop, and Hive.
- Publish Data
- You can stream data to different types of targets, such as Kafka, HDFS, NoSQL databases, and enterprise messaging systems.
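The difference between the tumbling and sliding windows mentioned for the Window transformation can be sketched in a few lines of Python. This is a conceptual illustration only, not how the Developer tool or Spark Streaming implements windows. The function names and the `(timestamp, payload)` event shape are assumptions made for the example, and windows are assumed to start at time 0 or later.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str]    # (timestamp in seconds, payload)
Window = Tuple[int, int]   # half-open interval [start, end)


def tumbling_windows(events: List[Event], size: int) -> Dict[Window, List[str]]:
    """Assign each event to exactly one fixed-size, non-overlapping window."""
    windows: Dict[Window, List[str]] = {}
    for ts, payload in events:
        start = (ts // size) * size
        windows.setdefault((start, start + size), []).append(payload)
    return windows


def sliding_windows(
    events: List[Event], size: int, slide: int
) -> Dict[Window, List[str]]:
    """Assign each event to every window of length `size` that contains it.

    Window starts are multiples of `slide`, so windows overlap when slide < size.
    """
    windows: Dict[Window, List[str]] = {}
    for ts, payload in events:
        start = (ts // slide) * slide  # latest window start at or before ts
        # Walk back through earlier windows that still cover ts.
        while start > ts - size and start >= 0:
            windows.setdefault((start, start + size), []).append(payload)
            start -= slide
    return windows


events = [(1, "a"), (4, "b"), (7, "c")]
print(tumbling_windows(events, size=6))
# {(0, 6): ['a', 'b'], (6, 12): ['c']}
print(sliding_windows(events, size=6, slide=3))
# {(0, 6): ['a', 'b'], (3, 9): ['b', 'c'], (6, 12): ['c']}
```

With tumbling windows each event is counted once. With sliding windows an event can contribute to several overlapping windows, which suits rolling aggregates such as a moving average.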
Intelligent Streaming is built on the Informatica Big Data Platform and extends the platform to provide streaming capabilities. Intelligent Streaming uses Spark Streaming to process streamed data. It uses YARN to manage the resources on a Spark cluster more efficiently and uses third-party distributions to connect to and push job processing to a Hadoop environment.
Use Informatica Developer (the Developer tool) to create streaming mappings. Use the Hadoop run-time environment and the Spark engine to run the mapping. You can configure high availability to run the streaming mappings on the Hadoop cluster.
For more information about Intelligent Streaming, see the Informatica Intelligent Streaming User Guide.