Table of Contents

  1. Preface
  2. Introduction to Data Engineering Streaming
  3. Data Engineering Streaming Administration
  4. Sources in a Streaming Mapping
  5. Targets in a Streaming Mapping
  6. Streaming Mappings
  7. Window Transformation
  8. Appendix A: Connections
  9. Appendix B: Monitoring REST API Reference
  10. Appendix C: Sample Files

Amazon S3 Data Objects

An Amazon S3 data object is a physical data object that represents data in an Amazon S3 resource. After you configure an Amazon S3 connection, create an Amazon S3 data object to write to Amazon S3 targets.
You can configure the data object write operation properties that determine how data is loaded to Amazon S3 targets. After you create an Amazon S3 data object, create a write operation. You can use the Amazon S3 data object write operation as a target in streaming mappings. You can create the write operation for the Amazon S3 data object automatically, then edit the advanced properties of the write operation and run the mapping.
When you configure the data operation properties, specify the format in which the data object writes data. You can specify Avro, Parquet, or JSON as the format. When you specify Avro format, provide a sample Avro schema in an .avsc file. When you specify JSON format, you must provide a sample file.
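For reference, an .avsc file is a plain JSON document that follows the Apache Avro schema specification. The sketch below writes a minimal sample schema to a file; the record name, field names, and file name are hypothetical placeholders, not values required by the product.

```python
import json

# Hypothetical sample Avro schema; replace the record and field names
# with the ones that match your streaming data.
schema = {
    "type": "record",
    "name": "CustomerEvent",
    "fields": [
        {"name": "customer_id", "type": "long"},
        {"name": "event_type", "type": "string"},
        # A nullable field is declared as a union with "null".
        {"name": "amount", "type": ["null", "double"], "default": None},
    ],
}

# Save the schema as an .avsc file to supply when you configure
# the data operation properties.
with open("customer_event.avsc", "w") as f:
    json.dump(schema, f, indent=2)
```

The resulting customer_event.avsc file is what you would point the data object at when you select Avro as the format.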
You can pass any payload format directly from source to target in streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form, or to pass a payload format that is not supported.
You cannot use binary and CSV payloads in a streaming mapping with Amazon S3 targets.
To run a streaming mapping successfully when you select multiple objects from different Amazon S3 buckets, ensure that all the Amazon S3 buckets are in the same region and that you use the same credentials to access them.
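The same-region requirement can be checked before you run the mapping. The sketch below, assuming boto3 is installed and AWS credentials are configured, maps each selected bucket to its region with the S3 GetBucketLocation API; the helper names are my own, not part of the product.

```python
def bucket_regions(bucket_names):
    """Return a {bucket: region} map for the given bucket names.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported here so all_same_region works without boto3

    s3 = boto3.client("s3")
    regions = {}
    for name in bucket_names:
        # GetBucketLocation returns None for buckets in us-east-1.
        loc = s3.get_bucket_location(Bucket=name)["LocationConstraint"]
        regions[name] = loc or "us-east-1"
    return regions


def all_same_region(regions):
    """True when every bucket in the map resolves to one region."""
    return len(set(regions.values())) <= 1
```

A typical check would be `all_same_region(bucket_regions(["bucket-a", "bucket-b"]))`, with the bucket names replaced by the ones you selected in the mapping.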
You cannot run a mapping with an Amazon S3 data object on MapR and Azure HDInsight distributions.