User Guide

10.5.6

Parquet

Use the wizard to create a transformation with Parquet input or output. When you create a Data Processor transformation to transform the Parquet format, you select a Parquet schema or example file that defines the expected structure of the Parquet data. The wizard creates components that transform Parquet format to other formats, or from other formats to Parquet format. After the wizard creates the transformation, you can further configure the transformation to determine the mapping logic.

Apache Parquet is a columnar storage format that can be processed in a Hadoop environment. Parquet is designed to handle complex nested data structures, and it uses a record shredding and assembly algorithm to store them in columnar form. For more information about Parquet, see http://parquet.incubator.apache.org/documentation/latest//.
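To make the idea of columnar storage for nested data more concrete, the following minimal Python sketch flattens nested records into one list of values per leaf column. It is an illustration only, not the Parquet implementation: it assumes every record has the same fields, and it omits the repetition and definition levels that the real record shredding algorithm uses for optional and repeated fields. The function names `shred` and `to_columns` and the sample records are hypothetical.

```python
from collections import defaultdict


def shred(record, prefix=""):
    """Yield (column_path, value) pairs for every leaf field in a nested record."""
    for name, value in record.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # A nested dict is a group: descend and extend the column path.
            yield from shred(value, path)
        else:
            yield path, value


def to_columns(records):
    """Group leaf values by column path, producing one list per column."""
    columns = defaultdict(list)
    for record in records:
        for path, value in shred(record):
            columns[path].append(value)
    return dict(columns)


# Hypothetical sample records, not taken from the guide.
records = [
    {"id": 1, "name": {"first": "Ada", "last": "Lovelace"}},
    {"id": 2, "name": {"first": "Alan", "last": "Turing"}},
]

columns = to_columns(records)
print(columns)
```

Each key in `columns` is the dotted path of a leaf field, and each value holds that field's data for all records, which is the layout that lets a columnar reader scan one field without touching the others.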

A transformation that reads Parquet input or writes Parquet output relies on a schema. When the transformation reads or writes Parquet data, it uses the schema to interpret the hierarchy of the data.
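As a rough illustration of how a schema drives the interpretation of hierarchy, the Python sketch below walks a schema to list the leaf columns and to rebuild a nested record from flat column data. The schema representation (nested dicts for groups, type-name strings for leaves), the function names, and the sample data are all assumptions made for this example. They are not the Parquet schema syntax or anything the Data Processor transformation exposes.

```python
# Hypothetical schema: nested dicts are groups, strings are leaf types.
SCHEMA = {
    "id": "int32",
    "name": {"first": "string", "last": "string"},
}


def leaf_paths(schema, prefix=""):
    """Return the dotted path of every leaf column, in schema order."""
    paths = []
    for field, node in schema.items():
        path = f"{prefix}.{field}" if prefix else field
        if isinstance(node, dict):
            paths.extend(leaf_paths(node, path))
        else:
            paths.append(path)
    return paths


def assemble(schema, columns, row, prefix=""):
    """Rebuild one nested record from flat column data by walking the schema."""
    record = {}
    for field, node in schema.items():
        path = f"{prefix}.{field}" if prefix else field
        if isinstance(node, dict):
            # A group in the schema becomes a nested record in the output.
            record[field] = assemble(node, columns, row, path)
        else:
            record[field] = columns[path][row]
    return record


# Hypothetical column data keyed by leaf path.
columns = {
    "id": [1, 2],
    "name.first": ["Ada", "Alan"],
    "name.last": ["Lovelace", "Turing"],
}

paths = leaf_paths(SCHEMA)
second = assemble(SCHEMA, columns, 1)
print(paths)
print(second)
```

Without the schema, the flat column data alone does not say which fields belong to which group. The schema walk is what restores the nesting.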

After you create a Data Processor transformation for Parquet input, you add it to a mapping with a complex file reader. The complex file reader passes Parquet input to the transformation. For a Data Processor transformation with Parquet output, you add a complex file writer to the mapping to receive the output from the transformation.


Creating a Transformation with Parquet Input or Output

Configure the Complex File Reader For Parquet Input

Configure a Transformation with Parquet Output
