Performance Tuning and Sizing Guidelines for Informatica® Big Data Management 10.2.2

Big Data Streaming Sizing and Tuning Recommendations

Use Informatica Big Data Streaming mappings to collect streaming data, build the business logic for the data, and push the logic to a Spark engine for processing. The Spark engine uses Spark Streaming to process data. A streaming mapping includes streaming sources such as Kafka or JMS. The Spark engine reads the data, divides it into micro batches, and publishes it.
Streaming mappings run continuously. When you create and run a streaming mapping, a Spark application is created on the Hadoop cluster that runs indefinitely until it is stopped or cancelled through the Data Integration Service. Because a batch is triggered for every micro batch interval configured for the mapping, consider the following recommendations:
  • The processing time for each batch should remain stable for the entire duration of the run.
  • The processing time of every batch must be less than the batch interval.
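The second recommendation matters because micro batches arrive on a fixed schedule: if a batch takes longer to process than the batch interval, the next batch must wait, and the scheduling delay grows without bound. The following sketch (plain Python, not Informatica or Spark API code; the function name and inputs are illustrative) shows how that backlog accumulates:

```python
def backlog_after(batch_interval_s, processing_times_s):
    """Simulate the scheduling delay of a micro-batch stream.

    A new batch arrives every batch_interval_s seconds. If a batch takes
    longer than the interval to process, the excess carries over as delay
    for the next batch; delay only shrinks when a batch finishes early.
    Returns the accumulated delay (seconds) after the last batch.
    """
    delay = 0.0
    for t in processing_times_s:
        # Carry over any time by which this batch overran its slot.
        delay = max(0.0, delay + t - batch_interval_s)
    return delay

# With a 5-second interval, 4-second batches keep the delay at zero...
print(backlog_after(5, [4, 4, 4]))  # 0.0
# ...but 6-second batches add 1 second of lag per batch, and the
# application falls progressively further behind the stream.
print(backlog_after(5, [6, 6, 6]))  # 3.0
```

This is why both recommendations work together: keeping per-batch processing time stable makes it possible to verify, once, that it sits safely below the batch interval.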
