Common Content for Data Engineering 10.5.1
Enter options in the following format:

... -o option_type.option_name=value
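For example, a minimal command sketch that sets one of the options below might look like the following. The infacmd dis UpdateServiceOptions command and the -dn, -sn, -un, and -pd arguments are assumptions for illustration, and the domain name, service name, and credentials are placeholders; only the -o option_type.option_name=value format comes from this reference.

# Sketch only: update a single Data Integration Service option (domain, service, and credentials are placeholders)
infacmd.sh dis UpdateServiceOptions -dn MyDomain -sn MyDataIntegrationService -un Administrator -pd MyPassword -o ExecutionOptions.MaxMappingParallelism=4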
Option
| Description
|
AdvancedProfilingServiceOptions.ColumnsPerMapping
| Limits the number of columns that can be profiled in a single mapping to save memory and disk space. Default is 5. If you profile a source with more than 100 million rows, decrease the value to as low as 1.
|
AdvancedProfilingServiceOptions.ExecutionPoolSize
| Maximum number of threads to run mappings.
|
AdvancedProfilingServiceOptions.MaxMemPerRequest
| Maximum amount of memory, in bytes, that the Data Integration Service can allocate for each mapping run for a single profile request.
Default is 536,870,912.
|
AdvancedProfilingServiceOptions.MaxNumericPrecision
| Maximum number of digits for a numeric value.
|
AdvancedProfilingServiceOptions.MaxParallelColumnBatches
| Number of threads that can run mappings at the same time. Default is 1.
|
AdvancedProfilingServiceOptions.MaxStringLength
| Maximum length of a string that the profiling service can process.
|
AdvancedProfilingServiceOptions.MaxValueFrequencyPairs
| Maximum number of value/frequency pairs to store in the profiling warehouse. Default is 16,000.
|
AdvancedProfilingServiceOptions.MinPatternFrequency
| Minimum number of patterns to display for a profile.
|
AdvancedProfilingServiceOptions.ReservedThreads
| Number of threads out of the Maximum Execution Pool Size that are reserved for priority requests. Default is 1.
|
AdvancedProfilingServiceOptions.ValueFrequencyMemSize
| Amount of memory to allow for value-frequency pairs. Default is 64 megabytes.
|
DataObjectCacheOptions.CacheConnection
| The database connection name for the database that stores the data object cache. Enter a valid connection object name.
|
DataObjectCacheOptions.CacheRemovalTime
| The number of milliseconds the Data Integration Service waits before cleaning up cache storage after a refresh. Default is 3,600,000.
|
DeploymentOptions.DefaultDeploymentMode
| Determines whether to enable and start each application after you deploy it to a Data Integration Service.
Enter one of the following options:
|
DataObjectCacheOptions.EnableNestedLDOCache
| Indicates that the Data Integration Service can use cache data for a logical data object used as a source or a lookup in another logical data object during a cache refresh. If false, the Data Integration Service accesses the source resources even if you enabled caching for the logical data object used as a source or a lookup.
For example, logical data object LDO3 joins data from logical data objects LDO1 and LDO2. A developer creates a mapping that uses LDO3 as the input and includes the mapping in an application. You enable caching for LDO1, LDO2, and LDO3. If you enable nested logical data object caching, the Data Integration Service uses cache data for LDO1 and LDO2 when it refreshes the cache table for LDO3. If you do not enable nested logical data object caching, the Data Integration Service accesses the source resources for LDO1 and LDO2 when it refreshes the cache table for LDO3.
Default is false.
|
DataObjectCacheOptions.MaxConcurrentRefreshRequests
| Maximum number of cache refreshes that can occur at the same time.
|
ExecutionContextOptions.Spark.MSPEnableUnassignedData
| If true, enables midstream parsing functionality that captures unparsed data in the source string and saves it in an UnassignedData array as an unidentifiedDataItem.
By default, if the parser encounters a data field that it cannot parse, the data is ignored. However, the complex data schema of the source string can change. For example, a software update on the server might change the JSON or XML structure. This option allows you to capture the unparsed data for analysis.
Default is false.
|
ExecutionOptions.BigDataJobRecovery
| If true, enables data engineering job recovery and distributed queueing for deployed jobs configured to run on the Spark engine.
Default is false.
|
ExecutionOptions.CacheDirectory
| Directory for index and data cache files for transformations. Default is <home directory>/cache.
Enter a list of directories separated by semicolons to increase performance during cache partitioning for Aggregator, Joiner, or Rank transformations.
You cannot use the following characters in the directory path:
|
ExecutionOptions.DisHadoopKeytab
| The file path to the Kerberos keytab file on the machine on which the Data Integration Service runs.
|
ExecutionOptions.DisHadoopPrincipal
| Service Principal Name (SPN) of the Data Integration Service to connect to a Hadoop cluster that uses Kerberos authentication.
|
ExecutionOptions.DISHomeDirectory
| Root directory accessible by the node. This is the root directory for other service directories. Default is <Informatica installation directory>/tomcat/bin. If you change the default value, verify that the directory exists.
You cannot use the following characters in the directory path:
|
ExecutionOptions.EnableOSProfile
| Indicates that the Data Integration Service can use operating system profiles for mapping execution. You can enable operating system profiles if the Data Integration Service runs on UNIX or Linux.
Default is false.
|
ExecutionOptions.HadoopDistributionDir
| The directory that contains a collection of Hadoop JAR files on the cluster from the RPM installation locations. The directory contains the minimum set of JAR files required to process Informatica mappings in a Hadoop environment. Type /<PowerCenterBigDataEditionInstallationDirectory>/Informatica/services/shared/hadoop/[Hadoop_distribution_name].
|
ExecutionOptions.HadoopInfaHomeDir
| The PowerCenter Big Data Edition home directory on every data node, created by the Hadoop RPM installation. Type /<PowerCenterBigDataEditionInstallationDirectory>/Informatica.
|
ExecutionOptions.MaxHadoopBatchExecutionPoolSize
| Maximum number of deployed jobs that can run concurrently in the Hadoop environment. The Data Integration Service moves Hadoop jobs from the queue to the Hadoop job pool when enough resources are available. Default is 100.
|
ExecutionOptions.MaxMappingParallelism
| Maximum number of parallel threads that process a single mapping pipeline stage.
When you set the value greater than one, the Data Integration Service enables partitioning for mappings and for mappings converted from profiles. The service dynamically scales the number of partitions for a mapping pipeline at run time. Increase the value based on the number of CPUs available on the nodes where mappings run.
In the Developer tool, developers can change the maximum parallelism value for each mapping. When maximum parallelism is set for both the Data Integration Service and the mapping, the Data Integration Service uses the minimum value when it runs the mapping.
Default is 1. Maximum is 64.
|
ExecutionOptions.MaxMemorySize
| Maximum amount of memory, in bytes, that the Data Integration Service can allocate for running all requests concurrently when the service runs jobs in the Data Integration Service process. When the Data Integration Service runs jobs in separate local or remote processes, the service ignores this value. If you do not want to limit the amount of memory the Data Integration Service can allocate, set this property to 0.
If the value is greater than 0, the Data Integration Service uses the property to calculate the maximum total memory allowed for running all requests concurrently. The Data Integration Service calculates the maximum total memory as follows:
Maximum Memory Size + Maximum Heap Size + memory required for loading program components
Default is 0.
If you run profiles or data quality mappings, set this property to 0.
|
ExecutionOptions.MaxNativeBatchExecutionPoolSize
| Maximum number of deployed jobs that can run concurrently in the native environment. The Data Integration Service moves native mapping jobs from the queue to the native job pool when enough resources are available. Default is 10.
|
ExecutionOptions.MaxOnDemandExecutionPoolSize
| Maximum number of on-demand jobs that can run concurrently. Jobs include data previews, profiling jobs, REST and SQL queries, web service requests, and mappings run from the Developer tool. All jobs that the Data Integration Service receives contribute to the on-demand pool size. The Data Integration Service immediately runs on-demand jobs if enough resources are available. Otherwise, the Data Integration Service rejects the job. Default is 10.
|
ExecutionOptions.OutOfProcessExecution
| Runs jobs in the Data Integration Service process, in separate DTM processes on the local node, or in separate DTM processes on remote nodes. Configure the property based on whether the Data Integration Service runs on a single node or a grid and based on the types of jobs that the service runs.
Enter one of the following options:
Default is OUT_OF_PROCESS.
|
ExecutionOptions.RejectFilesDirectory
| Directory for reject files. Reject files contain rows that were rejected when running a mapping. Default is <home directory>/reject.
You cannot use the following characters in the directory path:
|
ExecutionOptions.SourceDirectory
| Directory for source flat files used in a mapping. Default is <home directory>/source.
If the Data Integration Service runs on a grid, you can use a shared directory to create one directory for source files. If you configure a different directory for each node with the compute role, ensure that the source files are consistent among all source directories.
You cannot use the following characters in the directory path:
|
ExecutionOptions.TargetDirectory
| Default directory for target flat files used in a mapping. Default is <home directory>/target.
Enter a list of directories separated by semicolons to increase performance when multiple partitions write to the flat file target.
If the Data Integration Service runs on a grid, you can use a shared directory to create one directory for target files. If you configure a different directory for each node with the compute role, ensure that the target files are consistent among all target directories.
You cannot use the following characters in the directory path:
|
ExecutionOptions.TemporaryDirectories
| Directory for temporary files created when jobs are run. Default is <home directory>/disTemp.
Enter a list of directories separated by semicolons to optimize performance during profile operations and during cache partitioning for Sorter transformations.
You cannot use the following characters in the directory path:
|
HttpConfigurationOptions.AllowedHostNames
| List of constants or Java regular expression patterns compared to the host name of the requesting machine. The host names are case sensitive. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from host names that match the allowed host name pattern. If you do not configure this property, the Data Integration Service uses the Denied Host Names property to determine which clients can send requests.
|
HttpConfigurationOptions.AllowedIPAddresses
| List of constants or Java regular expression patterns compared to the IP address of the requesting machine. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from IP addresses that match the allowed address pattern. If you do not configure this property, the Data Integration Service uses the Denied IP Addresses property to determine which clients can send requests.
|
HttpConfigurationOptions.DeniedHostNames
| List of constants or Java regular expression patterns compared to the host name of the requesting machine. The host names are case sensitive. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from host names that do not match the denied host name pattern. If you do not configure this property, the Data Integration Service uses the Allowed Host Names property to determine which clients can send requests.
|
HttpConfigurationOptions.DeniedIPAddresses
| List of constants or Java regular expression patterns compared to the IP address of the requesting machine. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from IP addresses that do not match the denied IP address pattern. If you do not configure this property, the Data Integration Service uses the Allowed IP Addresses property to determine which clients can send requests.
|
HttpConfigurationOptions.HTTPProtocolType
| Security protocol that the Data Integration Service uses. Enter one of the following values: HTTP, HTTPS, or Both.
When you set the HTTP protocol type to HTTPS or Both, you enable Transport Layer Security (TLS) for the service.
You can also enable TLS for each web service deployed to an application. When you enable HTTPS for the Data Integration Service and enable TLS for the web service, the web service uses an HTTPS URL. When you enable HTTPS for the Data Integration Service and do not enable TLS for the web service, the web service can use an HTTP URL or an HTTPS URL. If you enable TLS for a web service and do not enable HTTPS for the Data Integration Service, the web service does not start. Default is HTTP.
|
HttpProxyServerOptions.HttpProxyServerDomain
| Domain for authentication.
|
HttpProxyServerOptions.HttpProxyServerHost
| Name of the HTTP proxy server.
|
HttpProxyServerOptions.HttpProxyServerPassword
| Password for the authenticated user. The Service Manager encrypts the password. This is required if the proxy server requires authentication.
|
HttpProxyServerOptions.HttpProxyServerPort
| Port number of the HTTP proxy server.
Default is 8080.
|
HttpProxyServerOptions.HttpServerUser
| Authenticated user name for the HTTP proxy server. This is required if the proxy server requires authentication.
|
LoggingOptions.LogLevel
| Level of error messages that the Data Integration Service writes to the Service log. Choose one of the following message levels: Fatal, Error, Warning, Info, Trace, or Debug.
|
MappingServiceOptions.MaxMemPerRequest
| The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
Default is 536,870,912.
|
MappingServiceOptions.MaxNotificationThreadPoolSize
| Maximum number of threads that the Data Integration Service uses to send notifications to the client.
|
Modules.MappingService
| Enter false to disable the module that runs mappings and previews. Default is true.
|
Modules.ProfilingService
| Enter false to disable the module that runs profiles and generates scorecards. Default is true.
|
Modules.RESTService
| Enter false to disable the module that runs the REST web service. Default is true.
|
Modules.SQLService
| Enter false to disable the module that runs SQL queries against an SQL data service. Default is true.
|
Modules.WebService
| Enter false to disable the module that runs web service operation mappings. Default is true.
|
Modules.WorkflowOrchestrationService
| Enter false to disable the module that runs workflows. Default is true.
|
PassThroughSecurityOptions.AllowCaching
| Allows data object caching for all pass-through connections in the Data Integration Service. Populates data object cache using the credentials in the connection object.
When you enable data object caching with pass-through security, you might allow unauthorized access to some data.
|
ProfilingServiceOptions.ExportPath
| Location to export profile results. Enter the file system path. Default is ./ProfileExport.
|
ProfilingServiceOptions.MaxExecutionConnections
| Maximum number of database connections for each profiling job.
|
ProfilingServiceOptions.MaxPatterns
| Maximum number of patterns to display for a profile.
|
ProfilingServiceOptions.MaxProfileExecutionPoolSize
| Maximum number of threads to run profiling.
|
ProfilingServiceOptions.MaxRanks
| Number of minimum and maximum values to display for a profile. Default is 5.
|
ProfilingServiceOptions.ProfileWarehouseConnectionName
| Connection object name for the connection to the profiling warehouse.
|
RepositoryOptions.RepositoryPassword
| User password to access the Model repository.
|
RepositoryOptions.RepositorySecurityDomain
| LDAP security domain name if you are using LDAP. If you are not using LDAP, the default domain is Native.
|
RepositoryOptions.RepositoryServiceName
| Name of the Model Repository Service that stores run-time metadata required to run mappings and SQL data services.
|
RepositoryOptions.RepositoryUserName
| User name to access the Model repository. The user must have the Create Project privilege for the Model Repository Service.
|
ResultSetCacheOptions.EnableEncryption
| Indicates whether result set cache files are encrypted using 128-bit AES encryption. Valid values are true or false. Default is true.
|
ResultSetCacheOptions.FileNamePrefix
| The prefix for the names of all result set cache files stored on disk. Default is RSCACHE.
|
SQLServiceOptions.DTMKeepAliveTime
| Number of milliseconds that the DTM process stays open after it completes the last request. Identical SQL queries can reuse the open process.
Use the keepalive time to increase performance when the time required to process the SQL query is small compared to the initialization time for the DTM process. If the query fails, the DTM process terminates. The value must be greater than or equal to 0. A value of 0 means that the Data Integration Service does not keep the DTM process in memory. Default is 0.
You can also set this property for each SQL data service that is deployed to the Data Integration Service. If you set this property for a deployed SQL data service, the value for the deployed SQL data service overrides the value you set for the Data Integration Service.
|
SQLServiceOptions.MaxMemPerRequest
| The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
Default is 50,000,000.
|
SQLServiceOptions.SkipLogFiles
| Prevents the Data Integration Service from generating log files when the SQL data service request completes successfully and the tracing level is set to INFO or higher. Default is false.
|
SQLServiceOptions.TableStorageConnection
| Relational database connection that stores temporary tables for SQL data services. By default, no connection is selected.
|
WorkflowOrchestrationServiceOptions.DBName
| Connection name of the database that stores run-time metadata for workflows.
|
WorkflowOrchestrationServiceOptions.MaxWorkerThreads
| The maximum number of threads that the Data Integration Service can use to run parallel tasks between a pair of inclusive gateways in a workflow. The default value is 10.
If the number of tasks between the inclusive gateways is greater than the maximum value, the Data Integration Service runs the tasks in batches that the value specifies. For example, if the Maximum Worker Threads value is 10, the Data Integration Service runs the tasks in batches of ten.
|
WSServiceOptions.DTMKeepAliveTime
| Number of milliseconds that the DTM process stays open after it completes the last request. Web service requests that are issued against the same operation can reuse the open process.
Use the keepalive time to increase performance when the time required to process the request is small compared to the initialization time for the DTM process. If the request fails, the DTM process terminates. The value must be greater than or equal to 0. A value of 0 means that the Data Integration Service does not keep the DTM process in memory. Default is 5000.
You can also set this property for each web service that is deployed to the Data Integration Service. If you set this property for a deployed web service, the value for the deployed web service overrides the value you set for the Data Integration Service.
|
WSServiceOptions.MaxMemPerRequest
| The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
Default is 50,000,000.
|
WSServiceOptions.SkipLogFiles
| Prevents the Data Integration Service from generating log files when the web service request completes successfully and the tracing level is set to INFO or higher. Default is false.
|
WSServiceOptions.WSDLLogicalURL
| Prefix for the WSDL URL if you use an external HTTP load balancer. For example, http://loadbalancer:8080
The Data Integration Service requires an external HTTP load balancer to run a web service on a grid. If you run the Data Integration Service on a single node, you do not need to specify the logical URL.
|
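To change several of the options listed in this table in one update, the name=value pairs can be repeated after -o. The following sketch again assumes the infacmd dis UpdateServiceOptions command with placeholder domain, service, and credential values; the option names come from the table above, so verify the exact command syntax and values for your environment.

# Sketch only: enable job recovery for Spark jobs and raise concurrency limits (illustrative values)
infacmd.sh dis UpdateServiceOptions -dn MyDomain -sn MyDataIntegrationService -un Administrator -pd MyPassword -o ExecutionOptions.BigDataJobRecovery=true ExecutionOptions.MaxOnDemandExecutionPoolSize=25 ExecutionOptions.MaxHadoopBatchExecutionPoolSize=200

Depending on the option, you might need to recycle the Data Integration Service before the new values take effect.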