Data Engineering Integration
| Workload Characteristics | EC2 Instance Type Recommendation |
|---|---|
| CPU-bound mapping and pass-through mapping | Use C series instances for task nodes. Task nodes are cheaper than core nodes and do not have a persistent data store. |
| I/O-bound mapping | For a low volume of data (up to 5 TB), use core nodes with default storage. For example, the d2.2xlarge instance has default storage of 6 x 2000 GB HDD. In general, HDD storage is appropriate for I/O-bound mapping workloads. For a large volume of data (10 TB or higher), add EBS HDD volumes to the core nodes. |
| Mixed load | Use core nodes with additional EBS HDD volumes, and dynamically add task nodes to meet your cluster's varying computational requirements. General-purpose SSD volumes are faster than HDD volumes but more expensive. |
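
The recommendations in the table can be read as a small decision rule. The following Python sketch is one possible encoding, not part of the product: the `recommend` function, the `Recommendation` class, and the workload labels `cpu_bound`, `io_bound`, and `mixed` are hypothetical names chosen for illustration. The table gives no guidance for I/O-bound data volumes between 5 TB and 10 TB, so the sketch reports that range as unspecified rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical container for one row of guidance from the table."""
    node_role: str   # "task", "core", or "core+task"
    storage: str     # storage guidance taken from the table
    note: str = ""   # extra guidance from the same table row


def recommend(workload: str, data_tb: Optional[float] = None) -> Recommendation:
    """Map a workload label (assumed names) to the table's recommendation."""
    if workload == "cpu_bound":
        # CPU-bound and pass-through mappings: C series task nodes.
        return Recommendation(
            "task",
            "no persistent data store",
            "use C series instances",
        )
    if workload == "io_bound":
        if data_tb is None:
            raise ValueError("data_tb is required for I/O-bound workloads")
        if data_tb <= 5:
            return Recommendation("core", "default instance storage (HDD)")
        if data_tb >= 10:
            return Recommendation("core", "default storage plus EBS HDD volumes")
        # The table does not cover volumes between 5 TB and 10 TB.
        return Recommendation(
            "core",
            "unspecified",
            "between the table's 5 TB and 10 TB thresholds",
        )
    if workload == "mixed":
        return Recommendation(
            "core+task",
            "additional EBS HDD volumes on core nodes",
            "add task nodes dynamically for compute",
        )
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    for args in [("cpu_bound",), ("io_bound", 3), ("io_bound", 12), ("mixed",)]:
        print(args, recommend(*args))
```

Keeping the thresholds in one function makes it easy to adjust them if your own sizing tests suggest different cutoffs.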