Application Service Guide

10.4.1

Multiple Threads for Each Pipeline Stage

When maximum parallelism is set to a value greater than 1, partitioning is enabled. The Data Integration Service separates a mapping into pipeline stages and uses multiple threads to process each stage.

When you maximize parallelism, the Data Integration Service dynamically performs the following tasks at run time:

- Divides the data into partitions. The Data Integration Service dynamically divides the underlying data into partitions and runs the partitions concurrently. The Data Integration Service determines the optimal number of threads for each pipeline stage. The number of threads used for a single pipeline stage cannot exceed the maximum parallelism value. The Data Integration Service can use a different number of threads for each pipeline stage.
- Redistributes data across partition points. The Data Integration Service dynamically determines the best way to redistribute data across a partition point based on the transformation requirements.
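
The two tasks above can be illustrated with a minimal Python sketch. It is not the Data Integration Service implementation. The function names, the round-robin split, and the hash-based redistribution are all assumptions chosen for illustration. It shows rows divided into partitions, redistributed by key at a partition point so that rows with the same key land in the same partition, and processed with a thread count capped at a maximum parallelism value.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, partition_count):
    """Divide rows round-robin into partition_count partitions (illustrative only)."""
    return [rows[i::partition_count] for i in range(partition_count)]


def redistribute_by_key(partitions, key_func, target_count):
    """Hash-redistribute rows so that rows with equal keys share one partition."""
    redistributed = [[] for _ in range(target_count)]
    for partition in partitions:
        for row in partition:
            redistributed[hash(key_func(row)) % target_count].append(row)
    return redistributed


def run_stage(partitions, stage_func, max_parallelism):
    """Apply stage_func to every partition, using at most max_parallelism threads."""
    thread_count = min(len(partitions), max_parallelism)
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(stage_func, partitions))


# Hypothetical rows of the form (group_key, value).
rows = [(i % 4, i) for i in range(12)]

# Read the data in two partitions, then regroup by key across three partitions.
read_partitions = split_into_partitions(rows, 2)
grouped_partitions = redistribute_by_key(read_partitions, lambda row: row[0], 3)

# Aggregate each partition concurrently with a maximum parallelism of three.
partial_sums = run_stage(
    grouped_partitions, lambda part: sum(value for _, value in part), 3
)
```

Because redistribution is keyed, each group key appears in exactly one of the three partitions, so a per-partition aggregate needs no further merging by key.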

The following image shows an example mapping that distributes data across multiple partitions for each pipeline stage:

The mapping distributes the reader pipeline stage and the first transformation pipeline stage across two partitions. At the second transformation pipeline stage, the mapping redistributes the rows across three partitions. The mapping distributes the writer pipeline stage across three partitions.

In the preceding image, maximum parallelism for the Data Integration Service is three. Maximum parallelism for the mapping is Auto. The Data Integration Service separates the mapping into four pipeline stages and uses a total of 12 threads to run the mapping. The Data Integration Service performs the following tasks at each of the pipeline stages:

At the reader pipeline stage, the Data Integration Service queries the Oracle database system and discovers that each of the two source tables, source A and source B, has two database partitions. The Data Integration Service uses one reader thread for each database partition.

At the first transformation pipeline stage, the Data Integration Service redistributes the data to group rows for the join condition across two threads.

At the second transformation pipeline stage, the Data Integration Service determines that three threads are optimal for the Aggregator transformation. The service redistributes the data to group rows for the aggregate expression across three threads.

At the writer pipeline stage, the Data Integration Service does not need to redistribute the rows across the target partition point. All rows in a single partition stay in that partition after crossing the target partition point.
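
The total of 12 threads can be checked with a short worked calculation. The breakdown below is one reading of the stage descriptions, not a statement from the guide. It assumes that reader threads are counted per source, with two threads for each of the two sources, and that the maximum parallelism limit applies to each entry separately.

```python
MAX_PARALLELISM = 3  # maximum parallelism for the Data Integration Service in the example

# Threads per pipeline stage, taken from the stage descriptions.
# Assumption: the reader stage is counted per source
# (two sources, two database partitions each).
threads_per_stage = {
    "reader (source A)": 2,
    "reader (source B)": 2,
    "first transformation stage (join)": 2,
    "second transformation stage (Aggregator)": 3,
    "writer": 3,
}

# Under this reading, no single entry exceeds the maximum parallelism value.
assert all(count <= MAX_PARALLELISM for count in threads_per_stage.values())

total_threads = sum(threads_per_stage.values())
print(total_threads)  # 12
```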
