Additional Options

The following table describes additional options that you can set for an EMR cluster:
Property
Description
Root device EBS volume size (GB)
Size of the EBS root device volume, in GB. Enter a value between 10 and 100.
Default is 10.
Tags
Optional. Tags to propagate to cluster EC2 instances.
Tags assist in identifying EC2 instances.
Format:
TagName1=TagValue1,TagName2=TagValue2
Bootstrap Actions
Optional. Actions to perform after EC2 instances are running, and before applications are installed.
Type the JSON statement here, or provide a path to a file that contains a JSON statement. Format:
file:\\<path_to_policy_config_file>
Custom AMI ID
Optional. ID of a custom Amazon Linux Amazon Machine Image (AMI). Copy the value from the AWS console.
Security Configuration
Optional. The name of a security configuration for authentication and encryption on the cluster.
Amazon EMR supports server-side encryption (SSE) and client-side encryption (CSE) configurations.
You can use the following at-rest security configurations:
  • SSE with Amazon S3-managed keys (SSE-S3)
  • SSE with AWS KMS-managed keys (SSE-KMS)
  • CSE with AWS KMS-managed keys (CSE-KMS)
  • Custom CSE configurations*
You can use the following in-transit security configurations:
  • PEM
  • Custom in-transit configurations*
You can also use custom AMIs for local disk security.
* To use custom security configurations, manually copy the .jar file to the Data Integration Service machine.
Applications
Optional. Applications to add to the default applications that AWS installs.
AWS installs certain applications when it creates an EMR cluster. To install more, select them from the drop-down list.
This field is equivalent to the Software Configuration list in the AWS EMR cluster creation wizard.
Software Settings
Optional. Custom configurations to apply to the applications installed on the cluster.
This field is equivalent to the Edit Software Settings field in the AWS EMR cluster creation wizard. Use it to modify the software configuration on the cluster.
Type the configuration JSON statement here, or provide a path to a file that contains a JSON statement. Format:
file:\\<path_to_custom_config_file>
Steps
Optional. Commands to run after cluster creation. For example, you can use this field to run Linux commands, or Hadoop commands such as HDFS or Hive commands.
This field is equivalent to the Add Steps field in the AWS cluster creation wizard.
Type the command statement here, or provide a path to a file that contains a JSON statement. Format:
file:\\<path_to_command_file>
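The following Python sketch shows one way to prepare values for the Root device EBS volume size, Tags, and Software Settings properties before you enter them. It is illustrative only and is not part of the product. The helper names are hypothetical. The sketch assumes that the 10 to 100 range is inclusive, that tag names and values cannot contain the = or , delimiters, and that Software Settings accepts the same list of Classification and Properties objects that the AWS EMR cluster creation wizard accepts. It requires Python 3.9 or later.

```python
import json


def check_root_volume_gb(size_gb: int) -> int:
    """Return size_gb if it falls in the documented range, else raise.

    Assumption: the documented range of 10 to 100 is inclusive.
    """
    if not 10 <= size_gb <= 100:
        raise ValueError(f"root volume size must be 10-100 GB, got {size_gb}")
    return size_gb


def format_tags(tags: dict[str, str]) -> str:
    """Render tags in the TagName1=TagValue1,TagName2=TagValue2 format."""
    for name, value in tags.items():
        # '=' and ',' delimit the format. Assumption: there is no escaping,
        # so a name or value that contains either character is rejected.
        if any(ch in "=," for ch in name + value):
            raise ValueError(f"tag {name!r} contains '=' or ','")
    return ",".join(f"{name}={value}" for name, value in tags.items())


def software_settings(settings: dict[str, dict[str, str]]) -> str:
    """Build a configuration JSON statement from classification -> properties.

    Assumption: the field accepts the AWS EMR configuration format, a JSON
    list of objects with "Classification" and "Properties" keys.
    """
    configs = [
        {"Classification": classification, "Properties": properties}
        for classification, properties in settings.items()
    ]
    return json.dumps(configs, indent=2)


if __name__ == "__main__":
    print(check_root_volume_gb(50))
    print(format_tags({"Project": "Sales", "Owner": "etl"}))
    print(software_settings({"spark-defaults": {"spark.executor.memory": "4g"}}))
```

For example, format_tags({"Project": "Sales", "Owner": "etl"}) returns Project=Sales,Owner=etl, which you can paste into the Tags property. You can paste the output of software_settings into the Software Settings property, or save it to a file and reference the file with the file:\\ format.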


Updated September 28, 2020