Profiling and Discovery Sizing Guidelines

Scaling the Run-time Environment for Profiling Service Module

The Profiling Service Module is a multithreaded module in the Data Integration Service. When the Profiling Service Module scales, it uses the additional memory, temporary disk space, and CPU cores that you make available on the machine.
The Profiling Service Module scales in the following ways based on the profile job type and the Profiling Service Module configuration:
Type
Description
Scaling Up
Scaling up is the ability of the Profiling Service Module to perform the same task proportionately faster after you add more resources to a single machine.
  • If you double the resources, the profile job takes half the time.
  • If you double the number of CPU cores, you must also add memory and temporary disk space.
Scaling Out
Scaling out is the ability of the Profiling Service Module to perform the same task proportionately faster after you add more machines. As with scaling up, doubling the number of machines halves the time required to run a profile job.
The Profiling Service Module scales out in the following ways:
  • Informatica grid. Run the Data Integration Service on an Informatica grid. The Profiling Service Module submits the profiling mappings so that the workload is distributed evenly across the grid.
  • Hadoop cluster. The Profiling Service Module runs the profile jobs on a Hadoop cluster. The Hadoop cluster is a distributed and highly scalable run-time environment to run profiling jobs. To run a profiling job faster, you can add nodes to the Hadoop cluster.
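The proportional relationship described for scaling up and scaling out can be written as simple arithmetic. The following Python sketch is illustrative only: the function name `estimated_run_time` and its parameters are hypothetical, and it assumes ideal scaling in which run time is inversely proportional to the resources or machines added. Actual profile jobs may scale less than this.

```python
def estimated_run_time(baseline_minutes: float, resource_factor: float) -> float:
    """Estimate profile job run time under ideal proportional scaling.

    Assumption (not a product formula): run time is inversely proportional
    to resource_factor, where 2.0 means double the resources on one machine
    (scaling up) or double the number of machines (scaling out).
    """
    if baseline_minutes < 0:
        raise ValueError("baseline_minutes must not be negative")
    if resource_factor <= 0:
        raise ValueError("resource_factor must be positive")
    return baseline_minutes / resource_factor


# A job that takes 60 minutes on the baseline environment.
print(estimated_run_time(60.0, 2.0))  # 30.0 (resources doubled)
print(estimated_run_time(60.0, 4.0))  # 15.0 (resources quadrupled)
```

Treat the result as a best-case estimate for planning, and measure actual run times after you add resources.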
Pushdown
In the pushdown method, the Profiling Service Module sends a part of the profile job to the relational database for processing in the form of an SQL query. The Profiling Service Module processes the query results.
If the relational database has sufficient processing power and the Data Integration Service limits performance, allocate more resources to the Data Integration Service where the Profiling Service Module runs. You can add more resources to a single node or add more machines to a grid. If the processing power of the relational database limits performance, add resources to the machine that runs the relational database.
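The pushdown idea can be illustrated with a small Python sketch. It is not the SQL that the Profiling Service Module generates, which this document does not show. It assumes a value-frequency profile, uses an in-memory SQLite database as a stand-in for the relational database, and the helper name `column_value_frequencies`, the table `customers`, and the column `country` are all hypothetical. The database does the grouping and counting, and the caller only post-processes the small result set.

```python
import sqlite3


def column_value_frequencies(conn, table, column):
    """Push the counting work to the database, then post-process the results.

    Hypothetical helper for illustration. The table and column names are
    interpolated into the query, so they must come from a trusted source.
    """
    # Part of the job sent to the database as an SQL query.
    query = f'SELECT "{column}", COUNT(*) FROM "{table}" GROUP BY "{column}"'
    rows = conn.execute(query).fetchall()

    # Part of the job done by the caller: turn counts into (count, share).
    total = sum(count for _, count in rows)
    return {value: (count, count / total) for value, count in rows}


# Stand-in database with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("US",), ("US",), ("DE",), ("US",)],
)

freq = column_value_frequencies(conn, "customers", "country")
for value in sorted(freq):
    print(value, freq[value])
```

In this arrangement the database performs most of the work, so its processing power sets the limit on how fast the query part of the job can run.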
