Advanced Clusters

AWS properties

Create an advanced configuration to configure properties for an advanced cluster. The properties describe where you want to start the cluster on your cloud platform and the infrastructure that you want to use.

Basic configuration

The following table describes the basic properties:
Property
Description
Name
Name of the advanced configuration.
Description
Description of the advanced configuration.
Runtime Environment
Runtime environment to associate with the advanced configuration. The runtime environment can contain only one Secure Agent. A runtime environment cannot be associated with more than one configuration.
If you don't select a runtime environment, the validation process can't verify the communication link to the Secure Agent or confirm that the Secure Agent meets the minimum runtime requirements to start a cluster.
Cloud Platform
Cloud platform that hosts the cluster.
Select Amazon Web Services (AWS).
Private Cluster
Creates an advanced cluster in which cluster resources have only private IP addresses.
When you choose to create a private cluster, you must specify the VPC and subnet in the advanced properties.

CLAIRE-powered configuration

Enable a CLAIRE-powered configuration to let CLAIRE configure the cluster to stay within cost boundaries and recommend changes that improve cluster performance and reduce infrastructure costs. You can use a CLAIRE-powered configuration if CLAIRE recommendations are enabled in your organization.
The following table describes the CLAIRE-powered configuration properties:
Property
Description
Optimization Preference
Cost or performance preference that CLAIRE uses to balance infrastructure costs with cluster performance.
Target Average Cost per Hour (USD)
Target average cost per hour in USD to run the advanced cluster.
Maximum Cost per Hour (USD)
Maximum cost per hour in USD to run the advanced cluster.
If you enable a CLAIRE-powered configuration, you configure fewer platform properties.

Platform configuration

The following table describes the platform properties:
Property
Description
Region
Region in which to create the cluster. Use the drop-down menu to view the regions that you can use.
Master Instance Type
Instance type to host the master node. Use the drop-down menu to view the instance types that you can use in your region.
To verify that the instance type that you select from the drop-down menu is supported in the selected availability zones and your AWS account, refer to the AWS documentation.
Not applicable in a CLAIRE-powered configuration.
Master Instance Profile
Instance profile to be attached to the master node. The name must consist of alphanumeric characters with no spaces. You can also include any of the following characters:
_+=,.@-
If you specify the master instance profile, you must also specify the worker instance profile.
Worker Instance Type
Instance type to host the worker nodes. Use the drop-down menu to view the instance types that you can use in your region.
To verify that the instance type that you select from the drop-down menu is supported in the selected availability zones and your AWS account, refer to the AWS documentation.
Not applicable in a CLAIRE-powered configuration.
Worker Instance Profile
Instance profile to be attached to the worker nodes. The name must consist of alphanumeric characters with no spaces. You can also include any of the following characters:
_+=,.@-
If you specify the worker instance profile, you must also specify the master instance profile.
Number of Worker Nodes
Number of worker nodes in the cluster. Specify the minimum and maximum number of worker nodes.
Not applicable in a CLAIRE-powered configuration.
Enable Spot Instances
Indicates whether to use Spot Instances for worker nodes.
Not applicable in a CLAIRE-powered configuration.
Spot Instance Price Ratio
Maximum percentage of On-Demand Instance price to pay for Spot Instances. Specify an integer value between 1 and 100.
Required if you enable Spot Instances. If you do not enable Spot Instances, this property is ignored.
Not applicable in a CLAIRE-powered configuration.
Enable High Availability
Indicates whether the cluster is highly available. An odd number of master nodes will be created based on the number of availability zones or subnets that you provide. You must provide at least three availability zones or subnets.
For example, if you provide six availability zones, five master nodes are created with each master node in a different availability zone.
When you provide multiple availability zones or subnets, worker nodes are highly available. Worker nodes are created across the availability zones or subnets regardless of whether high availability is enabled.
For more information about high availability, refer to the Kubernetes documentation.
Not applicable in a CLAIRE-powered configuration.
Availability Zones
List of AWS availability zones where cluster nodes are created. The master node is created in the first availability zone in the list. If multiple zones are specified, the cluster nodes are created across the specified zones.
If you specify availability zones, the zones must be unique and be within the specified region.
The availability zones that you can use depend on your AWS account. To check which zones are available for your account, refer to the AWS documentation.
Required if you do not specify a VPC. If you specify a VPC, you cannot provide availability zones. You must provide subnets instead of availability zones.
EBS Volume Type
Type of Amazon EBS volumes to attach to Amazon EC2 instances as local storage. You can use only EBS General Purpose SSD (gp2).
Not applicable in a CLAIRE-powered configuration.
EBS Volume Size
Size of the EBS volume to attach to a worker node for temporary storage during data processing. The volume size scales between the minimum and maximum based on job requirements. The range must be between 50 GB and 16 TB.
By default, the minimum and maximum volume sizes are 100 GB.
This configuration property does not apply to Graviton-enabled clusters, as Graviton does not support storage scaling.
When the volume size scales down, the jobs that are currently running on the cluster might take longer to complete.
Not applicable in a CLAIRE-powered configuration.
Cluster Shutdown
Cluster shutdown method. You can select one of the following cluster shutdown methods:
  • Smart shutdown. The Secure Agent stops the cluster when no job is expected during the defined idle timeout, based on historical data.
  • Idle timeout. The Secure Agent stops the cluster after the amount of idle time that you define.
Not applicable in a CLAIRE-powered configuration.
Mapping Task Timeout
Amount of time to wait for a mapping task to complete before it is terminated. By default, a mapping task does not have a timeout.
If you specify a timeout, a value of at least 10 minutes is recommended. The timeout begins when the mapping task is submitted to the Secure Agent.
Staging Location
Location on Amazon S3 for staging data.
You can use a path that includes the folders in the bucket, such as <bucket name>/<folder name>. Specify an S3 bucket in the same region as the cluster to improve latency.
Log Location
Location on Amazon S3 to store logs that are generated when you run an advanced job.
You can use a path that includes the folders in the bucket, such as <bucket name>/<folder name>. Specify an S3 bucket in the same region as the cluster to improve latency.
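
Several of the platform properties above carry naming or numeric constraints that can be checked before you submit a configuration. The following Python sketch is illustrative only and is not part of the product. All function names are hypothetical. It assumes that "alphanumeric" means ASCII letters and digits, that 16 TB is counted as 16 × 1024 GB, and that the number of master nodes is the largest odd number that does not exceed the number of availability zones or subnets, which matches the six-zone example in the table.

```python
import re

# Characters allowed in an instance profile name, per the table:
# alphanumerics (assumed ASCII) plus _+=,.@- and no spaces.
_PROFILE_NAME = re.compile(r"[A-Za-z0-9_+=,.@-]+")

GB_PER_TB = 1024  # assumption: the 16 TB limit is counted as 16 * 1024 GB


def is_valid_profile_name(name: str) -> bool:
    """Return True if the name uses only the documented characters."""
    return _PROFILE_NAME.fullmatch(name) is not None


def profiles_are_paired(master_profile, worker_profile) -> bool:
    """Master and worker instance profiles must be given together or not at all."""
    return bool(master_profile) == bool(worker_profile)


def max_spot_price(on_demand_price: float, price_ratio: int) -> float:
    """Highest Spot price paid, as a percentage of the On-Demand price."""
    if not isinstance(price_ratio, int) or not 1 <= price_ratio <= 100:
        raise ValueError("price ratio must be an integer between 1 and 100")
    return on_demand_price * price_ratio / 100


def master_node_count(zone_count: int) -> int:
    """Assumed rule: largest odd number <= the number of zones or subnets."""
    if zone_count < 3:
        raise ValueError("high availability needs at least three zones or subnets")
    return zone_count if zone_count % 2 == 1 else zone_count - 1


def is_valid_ebs_range(min_gb: int, max_gb: int) -> bool:
    """Minimum and maximum volume sizes must lie between 50 GB and 16 TB."""
    return 50 <= min_gb <= max_gb <= 16 * GB_PER_TB
```

For example, `master_node_count(6)` returns 5, and `max_spot_price(2.0, 50)` returns 1.0, meaning that with a ratio of 50 the cluster pays at most half of a 2.00 USD On-Demand hourly price for a Spot Instance.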

Advanced configuration

The following table describes the advanced properties:
Property
Description
VPC
Amazon Virtual Private Cloud (VPC) in which to create the cluster. The VPC must be in the specified region.
If you choose to not create a private cluster, you do not need to specify a VPC. In this case, the agent creates a VPC on your AWS account based on the region and the availability zones that you select.
If you plan to use the Sequence Generator transformation, you must specify a VPC and subnets.
Subnets
Subnets in which to create cluster nodes. Use a comma-separated list to specify the subnets.
Required if a VPC is specified. Each subnet must be in a different availability zone within the specified VPC.
If you do not specify a VPC, you cannot specify subnets. You must provide availability zones instead of subnets.
If you plan to use the Sequence Generator transformation, you must specify a VPC and subnets.
Initialization Script Path
Amazon S3 file path of the initialization script to run on each cluster node when the node is created. Use the format: <bucket name>/<folder name>. The script can reference other init scripts in the same folder or in a subfolder.
The script must be a bash script.
ELB Security Group
Defines the inbound rules between the Kubernetes API server and clients that are external to the advanced cluster. Also defines the outbound rules between the Kubernetes API server and the cluster nodes. This security group attaches to the load balancer that the Secure Agent provisions for the advanced cluster.
When you specify a security group, VPC and subnet information are required.
For more information about security groups, see Step 4. Create user-defined security groups for Amazon EC2.
Master Security Group ID
Defines the inbound rules between the master nodes and the worker nodes in the advanced cluster, the ELB security group, and the Secure Agent, and the outbound rules to other nodes. This security group attaches to all master nodes of the cluster.
When you specify a security group, VPC and subnet information are required.
For more information about security groups, see Step 4. Create user-defined security groups for Amazon EC2.
Worker Security Group ID
Defines the inbound and outbound rules between worker nodes in the advanced cluster and other nodes. This security group is attached to all worker nodes of the cluster.
When you specify a security group, VPC and subnet information are required.
For more information about security groups, see Step 4. Create user-defined security groups for Amazon EC2.
AWS Tags
AWS tags to apply to cluster nodes. Each tag has a key and a value. The key can be up to 127 characters long. The value can be up to 256 characters long.
You can list a maximum of 30 tags. The Secure Agent also assigns default tags to cloud resources. The default tags do not contribute to the limit of 30 tags.
Issues can occur when you override default tags. Do not override the following default tags:
  • Name
  • KubernetesCluster
  • k8s.io/cluster-autoscaler/enabled
  • k8s.io/cluster-autoscaler/<cluster instance ID>.k8s.local
The key cannot start with "aws:" because AWS reserves this prefix for its own use.
Tags cannot include UTF-8 characters \u241e and \u241f that correspond to record and unit separators represented by ASCII control characters 30 and 31.
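
The VPC, subnet, availability zone, and tag rules described above interact, so it can help to check them together. The following Python sketch is illustrative only and is not part of the product. The function names, argument names, and message texts are hypothetical, and the limits are copied from the tables on this page.

```python
from typing import Dict, List, Optional, Sequence

MAX_TAGS = 30
MAX_KEY_LEN = 127
MAX_VALUE_LEN = 256
# Default tags that should not be overridden, per the list above.
DEFAULT_TAG_KEYS = {"Name", "KubernetesCluster", "k8s.io/cluster-autoscaler/enabled"}
AUTOSCALER_PREFIX = "k8s.io/cluster-autoscaler/"
FORBIDDEN_CHARS = ("\u241e", "\u241f")  # record and unit separator symbols


def tag_problems(tags: Dict[str, str]) -> List[str]:
    """Return one message per tag rule that is violated."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append("more than 30 tags")
    for key, value in tags.items():
        if len(key) > MAX_KEY_LEN:
            problems.append("key longer than 127 characters")
        if len(value) > MAX_VALUE_LEN:
            problems.append("value longer than 256 characters")
        if key.startswith("aws:"):
            problems.append("key uses the reserved aws: prefix")
        is_cluster_tag = key.startswith(AUTOSCALER_PREFIX) and key.endswith(".k8s.local")
        if key in DEFAULT_TAG_KEYS or is_cluster_tag:
            problems.append("tag overrides a default tag: " + key)
        if any(ch in key or ch in value for ch in FORBIDDEN_CHARS):
            problems.append("tag contains a forbidden separator character")
    return problems


def network_problems(
    vpc: Optional[str],
    subnets: Sequence[str],
    availability_zones: Sequence[str],
    private_cluster: bool = False,
) -> List[str]:
    """Return one message per VPC, subnet, or availability zone rule violated."""
    problems = []
    if vpc:
        if not subnets:
            problems.append("subnets are required when a VPC is specified")
        if availability_zones:
            problems.append("provide subnets instead of availability zones with a VPC")
    else:
        if subnets:
            problems.append("subnets cannot be specified without a VPC")
        if not availability_zones:
            problems.append("availability zones are required when no VPC is specified")
        if private_cluster:
            problems.append("a private cluster requires a VPC and subnets")
    if len(set(availability_zones)) != len(availability_zones):
        problems.append("availability zones must be unique")
    return problems
```

An empty list from either function means that none of the checked rules is violated. The sketch does not check that zones belong to the region or that each subnet is in a different availability zone, because those checks require information from your AWS account.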

Runtime configuration

The following table describes the runtime properties:
Property
Description
Encrypt Data
Indicates whether temporary data on the cluster is encrypted.
Encrypting temporary data might slow down job performance.
Runtime Properties
Custom properties to customize the cluster and the jobs that run on the cluster.
