PowerExchange for Amazon S3 User Guide

Working with Multiple Files

You can read multiple flat files from Amazon S3 and write the data to a target in the native environment.
To read multiple flat files in the native environment, all files must be available in the same Amazon S3 bucket. To read from multiple folders in the Amazon S3 bucket, you must create a .manifest file that lists all the source files with their absolute paths or directory paths. Specify the .manifest file name in the following format:
<file_name>.manifest
For example, the .manifest file lists source files in the following format:
{
  "fileLocations": [
    {
      "URIs": [
        "dir1/dir2/dir3/file_1.csv",
        "dir1/dir2/dir3/file_2.csv",
        "dir1/file_3.csv"
      ]
    },
    {
      "URIPrefixes": [
        "dir1/dir2/dir3/",
        "dir1/dir2/dir4/"
      ]
    },
    {
      "WildcardURIs": [
        "dir1/dir2/dir3/dir5*/*/*.csv",
        "dir1/dir2/dir3/dir6/dir7/*"
      ]
    }
  ],
  "settings": {
    "stopOnFail": "true"
  }
}
You can specify the URIs, URIPrefixes, or WildcardURIs section, or all of these sections, within fileLocations in the .manifest file.
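The manifest structure described above can be generated from a script instead of being written by hand. The following is a minimal sketch, assuming the manifest is plain JSON as in the example. The helper name write_manifest and the output file name are hypothetical and are not part of PowerExchange for Amazon S3.

```python
import json

# Paths reuse the sample entries from this section; replace them with your own.
manifest = {
    "fileLocations": [
        {"URIs": ["dir1/dir2/dir3/file_1.csv", "dir1/file_3.csv"]},
        {"URIPrefixes": ["dir1/dir2/dir3/"]},
        {"WildcardURIs": ["dir1/dir2/dir3/dir6/dir7/*"]},
    ],
    "settings": {"stopOnFail": "true"},
}


def write_manifest(path, content):
    """Serialize the manifest dictionary as JSON into a <file_name>.manifest file."""
    if not path.endswith(".manifest"):
        raise ValueError("manifest file name must end with .manifest")
    with open(path, "w", encoding="utf-8") as handle:
        # json.dump inserts the commas between array entries automatically.
        json.dump(content, handle, indent=2)
    return path
```

For example, write_manifest("sources.manifest", manifest) creates a local file that you can then upload to the Amazon S3 bucket.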
In Data Preview, the data of the first file listed in the URIs section of the .manifest file is displayed. If the URIs section is empty, the first file in the folder specified in URIPrefixes is displayed.
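The preview rule above can be expressed as a small selection function. This is only an illustrative sketch of the documented order, not the connector's implementation. The function name preview_source is hypothetical, and the sketch assumes that when several URIPrefixes are listed, the first one is used.

```python
def preview_source(manifest):
    """Return ("uri", path) or ("prefix", folder) for the data preview, or None."""
    uris, prefixes = [], []
    for section in manifest.get("fileLocations", []):
        uris.extend(section.get("URIs", []))
        prefixes.extend(section.get("URIPrefixes", []))
    if uris:
        # The first listed URI supplies the preview data.
        return ("uri", uris[0])
    if prefixes:
        # Assumption: the first listed folder is used; its first file is previewed.
        return ("prefix", prefixes[0])
    return None
```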
You can specify an asterisk (*) wildcard in the name of a flat file to fetch files from the Amazon S3 bucket. Use the asterisk (*) wildcard to fetch all the files or only the files that match a name pattern. The asterisk (*) wildcard applies at the folder and file level within a single bucket. Specify the wildcard character in the following format:
abc*.txt
abc.*
For example, if you specify abc*.txt, all files whose names start with abc and end with the .txt file extension are read. If you specify abc.*, all files whose names start with abc are read regardless of the extension.
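To see which file names a pattern would select, you can approximate the matching locally. The sketch below treats the asterisk (*) as the only special character and assumes that it matches any run of characters within a file name. The connector's exact matching rules may differ, and the function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Build a regex in which only the asterisk (*) acts as a wildcard."""
    # Escape every character, then turn each escaped asterisk into "any characters".
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))


def matching_files(pattern, names):
    """Return the names that match the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


names = ["abc.txt", "abc_report.txt", "abc.csv", "xyz.txt"]
print(matching_files("abc*.txt", names))  # ['abc.txt', 'abc_report.txt']
print(matching_files("abc.*", names))     # ['abc.txt', 'abc.csv']
```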
Use the wildcard character to specify files from a single folder.
You cannot use the wildcard character to specify folder names. For example, the following WildcardURIs entries, which end in a folder name, are not valid:
{ "WildcardURIs": [ "multiread_wildcard/dir1*/", "multiread_wildcard/*/" ] }
PowerExchange for Amazon S3 supports only the asterisk (*) wildcard character.
