Property | Description |
---|---|
Name | Name of the advanced configuration. |
Description | Description of the advanced configuration. |
Runtime Environment | Runtime environment to associate with the advanced configuration. The runtime environment can contain only one Secure Agent. A runtime environment cannot be associated with more than one configuration.<br>If you don't select a runtime environment, the validation process can't validate the communication link to the Secure Agent or verify that the Secure Agent meets the minimum runtime requirements to start a cluster. |
Cloud Platform | Cloud platform that hosts the cluster.<br>Select Self-Service Cluster. |

Property | Description |
---|---|
Kubeconfig File Path | Path of the kubeconfig file.<br>A kubeconfig file organizes information about clusters, users, and authentication mechanisms.<br>Example: <directory name>/<file_name>.yaml<br>You can save the YAML file in any directory on the Secure Agent machine. |
Kube Context Name | Name of the cluster context.<br>A context is a named cluster and user pair that is used to send requests to the specified cluster with the provided authentication information. |
Cluster Version | Version of the Kubernetes cluster server.<br>The advanced configuration validates the major and minor versions of the Kubernetes cluster server, but does not validate the patch release version numbers. |
Namespace | Namespace where Informatica deploys resources. |
Number of Worker Nodes | Number of worker nodes in the cluster. Specify the minimum and maximum number of worker nodes. |
Cluster Idle Timeout | Amount of time before Informatica-created cluster resource objects are deleted due to inactivity. |
Mapping Task Timeout | Amount of time to wait for a mapping task to complete before it is terminated. By default, a mapping task does not have a timeout.<br>If you specify a timeout, a value of at least 10 minutes is recommended. The timeout begins when the mapping task is submitted to the Secure Agent. |
Staging Location | Complete path of the cloud location for staging data.<br>Specify the path in one of the following formats:<br>The region is optional. The default region is westus2.<br>The Secure Agent needs permissions to access the staging location to store staging files at run time. You must provide appropriate IAM access permissions to both the Secure Agent machine and the worker nodes running in your cluster to access the staging location. |
Log Location | Complete path of the cloud location for storing logs.<br>Specify the path in one of the following formats:<br>The region is optional. The default region is westus2.<br>The Secure Agent needs permissions to access the log location to store logs at run time. You must provide appropriate IAM access permissions to both the Secure Agent machine and the worker nodes running in your cluster to access the log location. |
Labels | Key-value pairs that Informatica attaches to the Kubernetes objects that it creates in the self-service cluster.<br>You can use labels to organize and select subsets of objects. Each object can have a set of key-value labels defined. Each key must be unique for a given object.<br>You cannot use the @ symbol in a label. For more information about the supported syntax and character set, see the Kubernetes documentation. |
Node Selector Labels | Use node selector labels to identify the nodes in the cluster on which Informatica can create Kubernetes objects. |
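
Two rules in the table above lend themselves to a quick pre-check before you save the configuration: the cluster version is compared on major and minor numbers only, and labels cannot contain the @ symbol. The following Python sketch illustrates both. The function names are hypothetical and are not part of any Informatica tool, and the label pattern is a simplified reading of the Kubernetes label syntax that checks only the name part of a key and the value, not the optional DNS prefix.

```python
import re

# Leading "major.minor" of a version string such as "1.27" or "v1.27.9".
_MAJOR_MINOR = re.compile(r"v?(\d+)\.(\d+)")

# Kubernetes syntax for a label value or the name part of a label key:
# at most 63 characters, alphanumeric at both ends, '-', '_' and '.' inside.
_LABEL_SEGMENT = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) and ignore the patch release and any suffix."""
    match = _MAJOR_MINOR.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_major_minor(configured: str, server: str) -> bool:
    """True when both versions agree on major and minor numbers."""
    return major_minor(configured) == major_minor(server)


def label_problems(labels: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, value in labels.items():
        if "@" in key or "@" in value:
            problems.append(f"{key}: the @ symbol is not allowed")
            continue
        name = key.rsplit("/", 1)[-1]  # drop the optional prefix, unchecked here
        if not _LABEL_SEGMENT.fullmatch(name):
            problems.append(f"{key}: invalid key name")
        if value and not _LABEL_SEGMENT.fullmatch(value):
            problems.append(f"{key}: invalid value")
    return problems
```

For example, `same_major_minor("1.27", "v1.27.9")` is true because only the patch release differs, while `label_problems({"owner": "a@b.com"})` reports one problem.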

Property | Description |
---|---|
Annotations | Key-value pairs that are used to attach arbitrary non-identifying metadata to objects. You can only define annotations for Pods in a cluster.<br>For more information about annotations, see the Kubernetes documentation. |
Tolerations | Key-value pairs that are used to ensure that Pods are scheduled on appropriate nodes.<br>When you configure a toleration, set the following properties:<br>For more information about tolerations, see the Kubernetes documentation. |
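
To make the role of tolerations more concrete, the following Python sketch models how Kubernetes decides whether a toleration matches a node taint. It follows the general Kubernetes rules (key, operator, value, effect) and is an illustration only. The class and function names are hypothetical, and the field names are the Kubernetes ones, which might not correspond one-to-one to the properties in the advanced configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # for example "NoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""        # empty key with operator "Exists" matches every key
    operator: str = "Equal"
    value: str = ""
    effect: str = ""     # empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """True when the toleration matches the taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    if toleration.operator == "Equal":
        return toleration.value == taint.value
    return False


def untolerated(taints: list[Taint], tolerations: list[Toleration]) -> list[Taint]:
    """Taints that no toleration matches; a Pod is kept off nodes with such taints."""
    return [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]
```

With a node tainted `dedicated=informatica:NoSchedule`, a toleration with the same key, value, and effect matches, as does a toleration with the key `dedicated` and the operator `Exists`.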

Property | Description |
---|---|
Encrypt Data | Indicates whether temporary data on the cluster is encrypted.<br>Encrypting temporary data might slow down job performance. |
Runtime Properties | Custom properties to customize the cluster and the jobs that run on the cluster. |

Hello documentation team,
In the context of a self-service cluster, I've understood that:
Can you please update?
Thanks,
Alessio
Thanks for reaching out, Alessio!
We received the following response to your query from our development team:
“For a self-service cluster, the maximum number of nodes is used by Data Integration to calculate the resource quota available for IICS, and to control job submission to avoid sending too many Spark jobs to the cluster at the same time. Otherwise, there can be resource deadlock for Spark execution. For example, if the maximum number of nodes is 3, and each node has 8 CPUs, IICS ensures that the Spark drivers submitted to the cluster occupy at most 8 CPUs, one third of the resources that IICS can use.
Once the Spark jobs are submitted to the cluster, it’s up to the cluster’s own Pod scheduler to decide where to run the jobs. It might run the job on more nodes than the maximum configured number of nodes.
If you want to make sure that Data Integration doesn’t use more than the maximum configured number of nodes, you should define multiple node groups in the cluster, each of which has a specific node label, and provide the corresponding node labels in the advanced cluster configuration so that the Data Integration resources are only allocated on nodes in a certain group.
Stopping a self-service cluster means removing the resources that IICS created in the cluster.”
We'll get the documentation updated in an upcoming release.
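
The arithmetic in the development team's example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Data Integration logic: the one-third driver share is taken from the single example in the response and might not hold in general, rounding down is an assumption, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def driver_cpu_cap(max_nodes: int, cpus_per_node: int,
                   driver_share: Fraction = Fraction(1, 3)) -> int:
    """CPUs that Spark drivers may occupy, as a share of the configured quota.

    The quota is max_nodes * cpus_per_node. The one-third default share and
    the rounding down are assumptions based on the example in the response.
    """
    total_cpus = max_nodes * cpus_per_node
    return math.floor(total_cpus * driver_share)


def can_submit(requested_driver_cpus: int, driver_cpus_in_use: int, cap: int) -> bool:
    """Admit a new job only if its driver still fits under the driver CPU cap."""
    return driver_cpus_in_use + requested_driver_cpus <= cap


cap = driver_cpu_cap(max_nodes=3, cpus_per_node=8)
print(cap)  # 8, matching the example: one third of 24 CPUs
```

Note that this cap only controls how many jobs are submitted. As the response explains, where the Pods actually run is decided by the cluster's own scheduler, so node groups with node selector labels are still needed to confine the work to specific nodes.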