Sizing Guidelines and Performance Tuning for Big Data Streaming 10.2.1

Overview

Use Informatica Big Data Streaming mappings to collect streaming data, build the business logic for the data, and push the logic to a Spark engine for processing. The Spark engine uses Spark Streaming to process the data: it reads the incoming data, divides it into micro batches, and publishes the results.
Streaming mappings run continuously. When you create and run a streaming mapping, a Spark application is created on the Hadoop cluster that runs until it is killed or cancelled through the Data Integration Service.
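For reference, the following PySpark sketch illustrates the micro-batch model that the Spark engine applies when it runs a streaming mapping. It is an illustration only, not code that Big Data Streaming generates; the broker address, topic name, and batch interval are hypothetical.

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    # Create a streaming context that groups incoming records into 5-second micro batches.
    sc = SparkContext(appName="MicroBatchSketch")
    ssc = StreamingContext(sc, 5)

    # Read from a Kafka topic. The broker address and topic name are hypothetical.
    stream = KafkaUtils.createDirectStream(
        ssc, ["events"], {"metadata.broker.list": "kafka-broker:9092"})

    # Each micro batch arrives as an RDD of (key, value) pairs; apply the mapping
    # logic to the values and publish a sample of each batch to the driver log.
    stream.map(lambda kv: kv[1]).pprint()

    # The application runs continuously until it is killed or cancelled.
    ssc.start()
    ssc.awaitTermination()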
To optimize the performance of Big Data Streaming and your system, perform the following tasks:
  • Determine your hardware requirements.
  • Tune the Spark engine.
  • Tune the mapping.
  • Tune the Kafka cluster.
