Big Data

This section describes new big data features in version 9.6.1 HotFix 2.

Informatica Analyst

Big Data Edition has the following new features and enhancements for the Analyst tool:
Analyst tool integration with Hadoop
Effective in version 9.6.1 HotFix 2, you can enable the Analyst tool to communicate with a Hadoop cluster on a specific Hadoop distribution. To enable this communication, configure the JVM Command Line Options for the Analyst Service.
For more information, see the Informatica 9.6.1 HotFix 2 Application Services Guide.
Analyst tool connections
Effective in version 9.6.1 HotFix 2, you can use the Analyst tool to connect to Hive or HDFS sources and targets.
For more information, see the Informatica 9.6.1 HotFix 2 Analyst User Guide.

Data Warehousing

Big Data Edition has the following new features and enhancements for data warehousing:
Binary Data Type
Effective in version 9.6.1 HotFix 2, a mapping in the Hive environment can process expression functions that use binary data.
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition User Guide.
Timestamp and Date Data Type
Effective in version 9.6.1 HotFix 2, PowerExchange for Hive supports the Timestamp and Date data types.
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition User Guide.
File Format
Effective in version 9.6.1 HotFix 2, you can use the Data Processor transformation to read Parquet input or write Parquet output.
Apache Parquet is a columnar storage format that can be processed in a Hadoop environment. Parquet is designed to handle complex nested data structures and uses a record shredding and assembly algorithm.
For more information, see the Informatica 9.6.1 HotFix 2 Data Transformation User Guide.
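To make the idea of record shredding into a columnar layout more concrete, the following is a minimal Python sketch. It is an illustration only, not Parquet's actual algorithm or any Informatica API: the `shred` function and the sample records are hypothetical, and the sketch ignores the repetition and definition levels that Parquet tracks to handle optional and repeated fields.

```python
from collections import defaultdict


def shred(records):
    """Split nested dict records into per-column value lists.

    Hypothetical helper for illustration: each leaf value is stored
    under its dotted field path, so all values of one field end up
    together, as they would in a columnar format.
    """
    columns = defaultdict(list)

    def walk(prefix, value):
        if isinstance(value, dict):
            # Descend into nested structures, extending the dotted path.
            for key, child in value.items():
                walk(f"{prefix}.{key}" if prefix else key, child)
        else:
            # Leaf value: append it to the column for this path.
            columns[prefix].append(value)

    for record in records:
        walk("", record)
    return dict(columns)


# Sample nested records (made up for this sketch).
records = [
    {"id": 1, "user": {"name": "ana", "city": "Lima"}},
    {"id": 2, "user": {"name": "bo", "city": "Oslo"}},
]
columns = shred(records)
```

Storing each field's values together is what lets a reader scan only the columns that a query needs. Reassembling the original records from the columns is the reverse step and is not shown here.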

Data Lineage

Effective in version 9.6.1 HotFix 2, you can perform data lineage analysis on big data sources and targets. You can create a Cloudera Navigator resource to extract metadata for big data sources and targets and perform data lineage analysis on the metadata.
For more information, see the Informatica 9.6.1 HotFix 2 Metadata Manager Administrator Guide.

Hadoop Ecosystem

Big Data Edition has the following new features and enhancements for the Hadoop ecosystem:
Hadoop Distributions
Effective in version 9.6.1 HotFix 2, Big Data Edition adds support for the following Hadoop distributions:
  • Cloudera CDH 5.2
  • Hortonworks HDP 2.2
  • IBM BigInsights 3.0.0.0
  • Pivotal HD 2.1
Big Data Edition drops support for the following Hadoop distributions:
  • Cloudera CDH 5.0
  • Cloudera CDH 5.1
  • Hortonworks HDP 2.1
  • Pivotal HD 1.1
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition Installation and Configuration Guide.
Effective in version 9.6.1 HotFix 2, Big Data Edition supports Cloudera CDH clusters on Amazon EC2.
Kerberos Authentication
Effective in version 9.6.1 HotFix 2, you can configure user impersonation for the native environment. Configure user impersonation to enable different users to run mappings or connect to big data sources and targets that use Kerberos authentication.
For more information, see the Informatica 9.6.1 Big Data Edition User Guide.

Performance Optimization

Big Data Edition has the following new features for performance optimization:
Compress data on temporary staging tables
Effective in version 9.6.1 HotFix 2, you can enable data compression on temporary staging tables when you run a mapping in the Hive environment. Compressing the staging tables might improve mapping performance.
To enable data compression on temporary staging tables, you must configure the Hive connection to use the codec class name that the Hadoop cluster uses. You must also configure the Hadoop cluster to enable compression on temporary staging tables.
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition User Guide.
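Because the codec class name in the Hive connection must match the one that the Hadoop cluster uses, a quick consistency check can catch a mismatch before a mapping runs. The following Python sketch is an assumption-laden illustration, not part of any Informatica tool: the `check_codec` helper is hypothetical, and the set lists codec classes that ship with Apache Hadoop, which may differ from what a given distribution supports.

```python
# Codec classes from the org.apache.hadoop.io.compress package.
# Which of these a cluster actually supports depends on the distribution.
HADOOP_CODECS = {
    "org.apache.hadoop.io.compress.DefaultCodec",
    "org.apache.hadoop.io.compress.GzipCodec",
    "org.apache.hadoop.io.compress.BZip2Codec",
    "org.apache.hadoop.io.compress.SnappyCodec",
    "org.apache.hadoop.io.compress.Lz4Codec",
}


def check_codec(connection_codec: str, cluster_codec: str) -> str:
    """Return the codec class name if both sides agree.

    Hypothetical helper: raises ValueError when the connection value is
    not a recognized Hadoop codec class or differs from the cluster value.
    """
    name = connection_codec.strip()
    if name not in HADOOP_CODECS:
        raise ValueError(f"unrecognized codec class: {name}")
    if name != cluster_codec.strip():
        raise ValueError(
            f"connection uses {name}, cluster uses {cluster_codec.strip()}"
        )
    return name


# Example values (assumed for this sketch).
agreed = check_codec(
    " org.apache.hadoop.io.compress.SnappyCodec ",
    "org.apache.hadoop.io.compress.SnappyCodec",
)
```

The check is a plain string comparison on purpose: the class name is case-sensitive and must be fully qualified, so normalizing anything beyond surrounding whitespace could hide a real mismatch.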
Parallel sort
Effective in version 9.6.1 HotFix 2, when you use a Sorter transformation in a mapping, the Data Integration Service enables parallel sorting by default when it pushes the mapping logic to the Hadoop cluster.
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition User Guide.

Profile Run on Hadoop Sources in Informatica Analyst

Effective in version 9.6.1 HotFix 2, you can create and run a column profile, rule profile, and data domain discovery on Hive and HDFS sources in the Analyst tool.
For more information, see the Informatica 9.6.1 HotFix 2 Big Data Edition User Guide.


Updated May 28, 2019