Table of Contents

  1. Abstract
  2. Installation and Upgrade
  3. 10.1.1 HotFix 1 Fixed Limitations and Closed Enhancements
  4. 10.1.1 HotFix 1 Known Limitations
  5. 10.1.1 Update 2 Fixed Limitations and Closed Enhancements
  6. 10.1.1 Update 2 Known Limitations
  7. 10.1.1 Update 1 Fixed Limitations and Closed Enhancements
  8. 10.1.1 Update 1 Known Limitations
  9. 10.1.1 Fixed Limitations and Closed Enhancements
  10. 10.1.1 Known Limitations
  11. Informatica Global Customer Support

Big Data Known Limitations (10.1.1 Update 2)

The following table describes known limitations. Each entry lists the bug number followed by its description:
OCON-8382
When you read data through Sqoop and the source contains numeric data types with a precision greater than 15, the target data is truncated after 15 digits. This issue occurs when you run the mapping on the Spark engine.
OCON-8353
When a mapping reads from or writes to an HDFS on a secure cluster, the mapping log incorrectly states that the Data Integration Service used the HDFS connection user to connect to HDFS. In fact, the Data Integration Service uses the following users, in order of preference, to connect to HDFS:
  1. Operating system profile user, if configured.
  2. Impersonation user, if configured.
  3. HDFS connection user, if configured.
  4. Data Integration Service user.
The mapping log is correct when you read from or write to HDFS on a non-secure cluster.
OCON-7687
When you export data through Sqoop and the column names contain mixed case characters, the mapping fails.
OCON-7669
When you configure Sqoop and OraOop, and export data to an Oracle target that contains mixed case characters in the table name, the mapping fails.
Workaround: Use the generic Oracle JDBC driver to export data.
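One way to fall back to the generic driver, sketched here with the standard Oracle driver class, is to specify the driver class in the Sqoop arguments so that Sqoop uses its generic JDBC connector instead of OraOop:
--driver oracle.jdbc.OracleDriver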
OCON-7620
When you import data from an IBM DB2 source through Sqoop and the table name contains mixed case characters, the mapping fails.
OCON-7521
Column profile run fails when the following conditions are true:
  1. You use Cloudera Connector Powered by Teradata to read data from a Teradata Sqoop data source.
  2. You create a column profile on the Sqoop data source.
  3. You run the profile using the Blaze engine.
Workaround: Create a separate Sqoop connection for each Sqoop data source with the --split-by option, and run a column profile on the data source.
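For example, the Sqoop arguments of each such connection might name a split column (the column name below is illustrative only):
--split-by emp_id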
OCON-7459
When you export data to an IBM DB2 target through Sqoop, the mapping fails if all of the following conditions are true:
  • You create or replace the IBM DB2 target table at run time.
  • The IBM DB2 target table name or column names contain mixed case characters.
  • You run the mapping on a Cloudera 5u8 cluster.
OCON-7431
When you read time data from a Teradata source and write it to a Teradata target, the fractional seconds get corrupted. This issue occurs if you run the Teradata Parallel Transporter mapping on a Hortonworks cluster and on the Blaze engine.
OCON-7429
When you run a Teradata Parallel Transporter mapping on a Hortonworks cluster and on the Blaze engine to write byte and varbyte data to a Teradata target, the data gets corrupted. This issue occurs when you use the hdp-connector-for-teradata-1.5.1.2.5.0.0-1245-distro.tar.gz JAR.
Workaround: Use the hdp-connector-for-teradata-1.4.1.2.3.2.0-2950-distro.tar.gz JAR.
OCON-7365
Sqoop mappings fail on MapR 5.2 clusters.
Workaround: Add the following property in the mapred-site.xml file on all nodes of the cluster, and restart the Hadoop services and cluster:
<property>
    <name>mapreduce.jobhistory.address</name>
    <value><Host_Name>:10020</value>
</property>
OCON-7291
Mappings that read data from a Teradata source and contain the != (not equal) operator in the filter override query fail. This issue occurs if you run the Teradata Parallel Transporter mapping on a Hortonworks cluster and on the Blaze engine.
Workaround: Use a native expression with the ne operator instead of the != operator.
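For example (the column name and value are illustrative only), replace a filter override condition such as:
dept_id != 100
with the native equivalent:
dept_id ne 100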
OCON-7073
When you run a Sqoop mapping on a Cloudera cluster that uses Kerberos authentication, you must manually configure mapreduce properties in the yarn-site.xml file on the Data Integration Service node and restart the Data Integration Service. To run the mapping on the Blaze engine, you must also restart the Grid Manager and Blaze Job Monitor.
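The entry does not name the specific properties. As an illustration of the yarn-site.xml format only (the property and host name below are placeholders, not necessarily the properties your cluster requires):
<property>
    <name>mapreduce.jobhistory.address</name>
    <value><Host_Name>:10020</value>
</property>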
LDM-3324
Column profile runs indefinitely when the following conditions are true:
  • The Hive data source for the profile resides on an Azure HDInsight cluster that uses WASB storage.
  • You create a column profile on the Hive source and enable data domain discovery.
  • You run the profile on the Hive engine in the Hadoop environment.
  • You do not use the JDBC connection as the profiling warehouse connection.
IDE-2410
When you create a column profile with data domain discovery on a Hive source and run the profile on the Hive engine in the Hadoop run-time environment set up on an Azure HDInsight cluster, the profile runs indefinitely.
IDE-2407
Column profile run fails when the following conditions are true:
  1. The profiling warehouse repository is on Microsoft SQL Server and you enable the Use DSN option to use the DSN configured in the Microsoft ODBC Administrator as the connect string.
  2. You create a column profile with data domain discovery and choose the sampling option as Random sample or Random sample (auto), or you create a column profile to perform only data domain discovery.
  3. You run the profile on the Blaze engine in the Hadoop run-time environment set up on an Azure HDInsight cluster.
BDM-7591
Mappings that read from and write to Hive sources and targets on Amazon S3 in a Hortonworks 2.3 cluster fail.
BDM-7348
When the Spark engine runs a mapping that contains a decimal data type on a Hortonworks version 2.3 or 2.4 cluster running on SUSE Linux, the mapping fails.
The following error message appears:
java.math.BigDecimal is not a valid external type for schema of int
The issue occurs due to a mismatch between the metadata that Informatica imports and the data type in the Hive table.
BDM-7347
Mappings that use the Snappy compression mode fail.
The following error message appears:
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z ...
Workaround for the Blaze, native, and Hive run-time engines:
  1. Copy Hadoop libraries from the cluster to the corresponding folders on the machine where the Data Integration Service is installed.
  2. Edit <Informatica installation directory>/Informatica/services/shared/hadoop/<Hadoop distribution name>_<version>/InfaConf/hadoopEnv.properties to add the following properties:
    spark.yarn.appMasterEnv.LD_LIBRARY_PATH=$INFA_HADOOP_DIST_DIR/lib/native:$LD_LIBRARY_PATH
    spark.executorEnv.LD_LIBRARY_PATH=$INFA_HADOOP_DIST_DIR/lib/native:$LD_LIBRARY_PATH
There is no workaround if you want to use the Spark run-time engine.
BDM-7126
Mappings that run on the Spark engine fail if you change the operating system profile user or the impersonation user.
Workaround: To run the mappings on the Spark engine with a different user, complete any one of the following tasks:
  • In the hadoopEnv.properties file, change the value of the infa.osgi.enable.workdir.reuse property to false.
  • Before changing the user, set the value of the infa.osgi.parent.workdir property in the hadoopEnv.properties file to a different working directory.
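A minimal sketch of the two alternatives as they might appear in hadoopEnv.properties (the working directory path is a placeholder, not a required value):
# Option 1: disable working directory reuse
infa.osgi.enable.workdir.reuse=false
# Option 2: before changing the user, point to a different working directory
# infa.osgi.parent.workdir=/tmp/infa_osgi_workdir_new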
BDM-6840
When a mapping runs on the Blaze engine, the row count reported for the target is inaccurate because it includes rejected rows.
BDM-6754
When the Data Integration Service is configured to run with operating system profiles and you push the mapping to an HDInsight cluster with ADLS as storage, the mapping fails with the following error:
Exception Class: [java.lang.RuntimeException] Exception Message: [java.io.IOException: No FileSystem for scheme: adl]. java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: adl
BDM-6694
When the Blaze engine reads from a compressed Hive table with text format, the mapping fails if the TBLPROPERTIES clause is not set for the Hive table.
Workaround: Create or alter the table with the TBLPROPERTIES clause. For example, TBLPROPERTIES ('text.compression'='Snappy').
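A hedged HiveQL sketch, using an illustrative table name:
ALTER TABLE web_logs SET TBLPROPERTIES ('text.compression'='Snappy');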
BDM-6598
When the Blaze engine runs a mapping on an Amazon EMR 5.0 cluster, the Blaze engine does not use the following properties set in the yarn-site.xml file on the machine that hosts the Data Integration Service:
  • fs.s3n.endpoint
  • fs.s3.awsAccessKeyId
  • fs.s3.awsSecretAccessKey
The Blaze engine uses the values for these properties from the yarn-site.xml file on the cluster.
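To make such values take effect, set them in the yarn-site.xml file on the cluster. A hedged sketch with placeholder values:
<property>
    <name>fs.s3n.endpoint</name>
    <value><S3_Endpoint></value>
</property>
<property>
    <name>fs.s3.awsAccessKeyId</name>
    <value><Access_Key_ID></value>
</property>
<property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value><Secret_Access_Key></value>
</property>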
BDM-6389
A mapping fails to add statistics to Hive table metadata after loading data to the table on Hortonworks.
Workaround: To view statistics for a table, run the following command on the Hive command line:
ANALYZE TABLE <table name> COMPUTE STATISTICS;
BDM-5465
Mappings that read from or write to partitioned or bucketed Hive sources and targets on Amazon S3 take longer to execute than expected.


Updated January 17, 2019