Big Data Management User Guide
Data Engineering Integration 10.2
Table of Contents
Preface
Introduction to Informatica Big Data Management
Informatica Big Data Management Overview
Example
Big Data Management Tasks
Read from and Write to Big Data Sources and Targets
Perform Data Discovery
Perform Data Lineage on Big Data Sources
Stream Machine Data
Process Streamed Data in Real Time
Manage Big Data Relationships
Big Data Management Component Architecture
Clients and Tools
Application Services
Repositories
Hadoop Environment
Hadoop Utilities
Big Data Management Engines
Blaze Engine Architecture
Application Timeline Server
Spark Engine Architecture
Hive Engine Architecture
Big Data Process
Step 1. Collect the Data
Step 2. Cleanse the Data
Step 3. Transform the Data
Step 4. Process the Data
Step 5. Monitor Jobs
Connections
Connections
Hadoop Connection Properties
HDFS Connection Properties
HBase Connection Properties
HBase Connection Properties for MapR-DB
Hive Connection Properties
JDBC Connection Properties
Sqoop Connection-Level Arguments
Creating a Connection to Access Sources or Targets
Creating a Hadoop Connection
Mappings in the Hadoop Environment
Mappings in the Hadoop Environment Overview
Mapping Run-time Properties
Validation Environments
Execution Environment
Reject File Directory
Updating Run-time Properties for Multiple Mappings
Data Warehouse Optimization Mapping Example
Sqoop Mappings in a Hadoop Environment
Sqoop Mapping-Level Arguments
m or num-mappers
split-by
batch
Configuring Sqoop Properties in the Mapping
Rules and Guidelines for Mappings in a Hadoop Environment
Workflows that Run Mappings in a Hadoop Environment
Configuring a Mapping to Run in a Hadoop Environment
Mapping Execution Plans
Blaze Engine Execution Plan Details
Spark Engine Execution Plan Details
Hive Engine Execution Plan Details
Viewing the Execution Plan for a Mapping in the Developer Tool
Optimization for the Hadoop Environment
Blaze Engine High Availability
Enabling Data Compression on Temporary Staging Tables
Step 1. Configure the Hive Connection to Enable Data Compression on Temporary Staging Tables
Step 2. Configure the Hadoop Cluster to Enable Compression on Temporary Staging Tables
Parallel Sorting
Truncating Partitions in a Hive Target
Scheduling, Queuing, and Node Labeling
Scheduling and Node Labeling Configuration
Queuing Configuration
Troubleshooting a Mapping in a Hadoop Environment
Mapping Objects in the Hadoop Environment
Sources in a Hadoop Environment
Flat File Sources
Hive Sources
Rules and Guidelines for Hive Sources on the Blaze Engine
Complex File Sources
Relational Sources
Sqoop Sources
Rules and Guidelines for Sqoop Sources
Targets in a Hadoop Environment
Flat File Targets
HDFS Flat File Targets
Hive Targets
Rules and Guidelines for Hive Targets on the Blaze Engine
Complex File Targets
Relational Targets
Sqoop Targets
Rules and Guidelines for Sqoop Targets
Transformations in a Hadoop Environment
Transformation Support on the Blaze Engine
Transformation Support on the Spark Engine
Transformation Support on the Hive Engine
Function and Data Type Processing
Rules and Guidelines for Spark Engine Processing
Rules and Guidelines for Hive Engine Processing
Processing Hierarchical Data on the Spark Engine
Processing Hierarchical Data on the Spark Engine Overview
How to Develop a Mapping to Process Hierarchical Data
Complex Data Types
Array Data Type
Map Data Type
Struct Data Type
Rules and Guidelines for Complex Data Types
Complex Ports
Complex Ports in Transformations
Rules and Guidelines for Complex Ports
Creating a Complex Port
Complex Data Type Definitions
Nested Data Type Definitions
Rules and Guidelines for Complex Data Type Definitions
Creating a Complex Data Type Definition
Importing a Complex Data Type Definition
Type Configuration
Changing the Type Configuration for an Array Port
Changing the Type Configuration for a Map Port
Specifying the Type Configuration for a Struct Port
Complex Operators
Extracting an Array Element Using a Subscript Operator
Extracting a Struct Element Using the Dot Operator
Complex Functions
Hierarchical Data Conversion
Convert Relational or Hierarchical Data to Struct Data
Creating a Struct Port
Convert Relational or Hierarchical Data to Nested Struct Data
Creating a Nested Complex Port
Extract Elements from Hierarchical Data
Extracting Elements from a Complex Port
Flatten Hierarchical Data
Flattening a Complex Port
Stateful Computing on the Spark Engine
Stateful Computing on the Spark Engine Overview
Windowing Configuration
Frame
Partition and Order Keys
Rules and Guidelines for Windowing Configuration
Window Functions
LEAD
LAG
Aggregate Functions as Window Functions
Aggregate Offsets
Nested Aggregate Functions
Rules and Guidelines for Window Functions
Windowing Examples
Financial Plans Example
GPS Pings Example
Aggregate Function as Window Function Example
Monitoring Mappings in the Hadoop Environment
Monitoring Mappings in the Hadoop Environment Overview
Hadoop Environment Logs
YARN Web User Interface
Accessing the Monitoring URL
Viewing Hadoop Environment Logs in the Administrator Tool
Monitoring a Mapping
Blaze Engine Monitoring
Blaze Job Monitoring Application
Blaze Summary Report
Time Taken by Individual Segments
Mapping Properties
Tasklet Execution Time
Selected Tasklet Information
Blaze Engine Logs
Viewing Blaze Logs
Troubleshooting Blaze Monitoring
Spark Engine Monitoring
Spark Engine Logs
Viewing Spark Logs
Hive Engine Monitoring
Hive Engine Logs
Mappings in the Native Environment
Mappings in the Native Environment Overview
Data Processor Mappings
HDFS Mappings
HDFS Data Extraction Mapping Example
Hive Mappings
Hive Mapping Example
Social Media Mappings
Twitter Mapping Example
Profiles
Profiles Overview
Native Environment
Hadoop Environment
Column Profiles for Sqoop Data Sources
Creating a Single Data Object Profile in Informatica Developer
Creating an Enterprise Discovery Profile in Informatica Developer
Creating a Column Profile in Informatica Analyst
Creating an Enterprise Discovery Profile in Informatica Analyst
Creating a Scorecard in Informatica Analyst
Monitoring a Profile
Troubleshooting
Native Environment Optimization
Native Environment Optimization Overview
Processing Big Data on a Grid
Data Integration Service Grid
Grid Optimization
Processing Big Data on Partitions
Partitioned Model Repository Mappings
Partition Optimization
High Availability
Data Type Reference
Data Type Reference Overview
Transformation Data Type Support in a Hadoop Environment
Complex File and Transformation Data Types
Avro and Transformation Data Types
Avro Union Data Type
Unsupported Avro Data Types
JSON and Transformation Data Types
Unsupported JSON Data Types
Parquet and Transformation Data Types
Parquet Union Data Type
Unsupported Parquet Data Types
Hive Data Types and Transformation Data Types
Hive Complex Data Types
Sqoop Data Types
Aurora Data Types
IBM DB2 and DB2 for z/OS Data Types
Greenplum Data Types
Microsoft SQL Server Data Types
Netezza Data Types
Oracle Data Types
Teradata Data Types
Teradata Data Types with TDCH Specialized Connectors for Sqoop
Complex File Data Object Properties
Complex File Data Objects Overview
Creating and Configuring a Complex File Data Object
Complex File Data Object Overview Properties
Compression and Decompression for Complex File Sources and Targets
Parameterization of Complex File Data Objects
Complex File Data Object Read Properties
General Properties
Ports Properties
Sources Properties
Advanced Properties
Column Projection Properties
Complex File Data Object Write Properties
General Properties
Port Properties
Sources Properties
Advanced Properties
Column Projection Properties
Function Reference
Function Support in a Hadoop Environment
Parameter Reference
Parameters Overview
Parameter Usage
Step 5. Monitor Jobs
Monitor the status of your processing jobs. You can view monitoring statistics for them in the Monitoring tool. After your processing jobs complete, you can use the processed data for business intelligence and analytics.
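As the table of contents notes, jobs that run in the Hadoop environment can also be observed through the YARN web user interface. As a minimal illustration (this is generic YARN tooling, not the Informatica Monitoring tool), the sketch below queries YARN's standard ResourceManager REST endpoint `/ws/v1/cluster/apps` and counts applications by state; the ResourceManager host and port are assumptions for your cluster:

```python
# Sketch: summarize cluster job status via the YARN ResourceManager REST API.
import json
from collections import Counter
from urllib.request import urlopen

def summarize_apps(apps_json):
    """Count YARN applications by state from a /ws/v1/cluster/apps response.

    The response wraps the application list as {"apps": {"app": [...]}};
    "apps" may be null when the cluster has no applications.
    """
    apps = (apps_json.get("apps") or {}).get("app") or []
    return Counter(app["state"] for app in apps)

def fetch_app_summary(rm_url="http://resourcemanager:8088"):
    # The default URL is a placeholder -- substitute your cluster's
    # monitoring URL. /ws/v1/cluster/apps is a standard YARN RM endpoint.
    with urlopen(rm_url + "/ws/v1/cluster/apps") as resp:
        return summarize_apps(json.load(resp))
```

A summary such as `{"RUNNING": 2, "FINISHED": 5}` gives a quick health check before you drill into individual jobs in the Monitoring tool or the YARN web user interface.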
Updated December 13, 2018