Table of Contents

  1. Abstract
  2. Installation and Upgrade
  3. 10.1.1 Fixed Limitations and Closed Enhancements
  4. 10.1.1 Known Limitations
  5. Informatica Global Customer Support

Big Data Known Limitations

The following entries describe known limitations. Each entry lists the bug number followed by a description:
PLAT-14325
You cannot run a mapping in the native environment when the following conditions are true:
  • You select a native validation environment and a Hive or Blaze validation environment for the mapping.
  • The mapping contains a Match transformation.
PLAT-13744
When you use Sqoop and define a join condition in the custom query, the mapping fails. (457397)
PLAT-13738
When you use Sqoop and join two tables that contain a column with the same name, the mapping fails. (457072)
PLAT-13735
When you use Sqoop and the first mapper task fails, the subsequent mapper tasks fail with the following error message:
File already exists
(456884)
PLAT-13734
The Developer tool allows you to change an Avro data type in a complex file object to one that Avro does not support. As a result, mapping errors occur at run time.
Workaround: If you change an Avro data type, verify that it is a supported type. (456866)
PLAT-13732
When you use Sqoop to import data from an Aurora database by using the MariaDB JDBC driver, the mapping stops responding. (456704)
PLAT-13731
When you export data through Sqoop and there are primary key violations, the mapping fails and bad records are not written to the bad file. (456616)
PLAT-13722
When you export data to a Netezza database through Sqoop and the database contains a column of the float data type, the mapping fails. (456285)
PLAT-13702
Sqoop does not read the OraOop arguments that you configure in the oraoop-site.xml file.
Workaround: Specify the OraOop arguments as part of the Sqoop arguments in the mapping. (455750)
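For example, an OraOop property can be passed through the Sqoop arguments with the generic -D syntax (the specific property shown here is only illustrative):
-Doraoop.import.consistent.read=true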
PLAT-13666
When you use Sqoop for a data object and update its properties in the associated Read or Write transformation, the mapping terminates with an IVector error message.
Workaround: Create a new data object and mapping. (453097)
PLAT-13652
When you enable Sqoop for a data object and a table or column name contains Unicode characters, the mapping fails. (452114)
PLAT-12073
Mappings that read from one of the following sources fail to run in the native environment when the Data Integration Service is configured to run jobs in separate remote processes:
  • Flat file or complex file in the Hadoop Distributed File System (HDFS)
  • Hive table
  • HBase table
Workaround: On the Compute view for the Data Integration Service, configure the INFA_HADOOP_DIST_DIR environment variable for each node with the compute role. Set the environment variable to the same value configured for the Data Integration Service Hadoop Distribution Directory execution option for the Data Integration Service. (443164)
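For example, if the Hadoop Distribution Directory execution option points to a Cloudera distribution directory, set the same value on every node with the compute role (the distribution directory name below is illustrative):
INFA_HADOOP_DIST_DIR=<Informatica Installation Directory>/services/shared/hadoop/cloudera_cdh5u8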
PLAT-8729
If you configure MapR 5.1 on SUSE 11 and run a Sqoop mapping on a Hadoop cluster, the mapping fails with the following error:
com.mapr.security.JNISecurity.SetClusterOption(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Isqoop
OCON-6758
When you run a Sqoop mapping on the Blaze engine to import data from multiple sources and the join condition contains an OR clause, the mapping fails.
OCON-6756
In a Sqoop mapping, if you add a Filter transformation to filter timestamp data from a Teradata source and export the data to a Teradata target, the mapping runs successfully on the Blaze engine. However, the Sqoop program does not write the timestamp data to the Teradata target.
OCON-6745
When you use a JDBC connection in a mapping to connect to a Netezza source that contains the Time data type, the mapping fails to run on the Blaze engine.
OCON-1316
The Union transformation produces incorrect results for Sqoop mappings that you run on the Hortonworks distribution by using the TEZ engine. (460889)
OCON-1267
The path of the resource file in a complex file object appears as a recursive path of directories starting with the root directory and ending with a string. (437196)
OCON-1100
When you export data to an IBM DB2 z/OS database through Sqoop and do not configure the batch argument, the mapping fails.
Workaround: Configure the batch argument in the mapping and run the mapping again. (459671)
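For example, you can enable batch mode by adding the following Sqoop argument to the mapping:
--batch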
OCON-937
When you use an ODBC connection to write time data to a Netezza database, the mapping fails. This issue occurs when you run the mapping on Cloudera 5u4. (440423)
OCON-688
When you enable Sqoop for a logical data object and export data to an IBM DB2 database, the Sqoop export command fails. However, the mapping runs successfully without any error. (456455)
IDE-1689
Mappings and profiles that use snappy compression fail in HiveServer2 mode on HDP and CDH SUSE clusters.
Workaround:
On the Informatica domain, edit the property that contains the location of the cluster native library:
  1. Back up the following file, then open it for editing:
    <Informatica Installation Directory>/services/shared/hadoop/<Hadoop_distribution_name>_<version_number>/infaConf/hadoopEnv.properties
  2. Find the $HADOOP_NODE_HADOOP_DIST/lib/native property, and replace the value with the location of the cluster native library.
    Hortonworks example:
    /usr/hdp/2.4.2.0-258/hadoop/lib/native
    Cloudera example:
    /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
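    A minimal before-and-after sketch of this edit, with the property key shown as a placeholder for whichever hadoopEnv.properties entry contains the native library path:
    # before
    <property_key>=$HADOOP_NODE_HADOOP_DIST/lib/native
    # after (Hortonworks example)
    <property_key>=/usr/hdp/2.4.2.0-258/hadoop/lib/native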
On the Hadoop cluster:
  1. Open the HiveServer2_EnvInfa.txt file for editing.
  2. Change the value of <Informatica distribution home>/services/shared/hadoop/<Hadoop_distribution>/lib/native to the location of the cluster native library.
  3. Copy the contents of the HiveServer2_EnvInfa.txt file.
  4. Open the hive-env.sh file for editing, and paste the entire contents of the HiveServer2_EnvInfa.txt file.
(452819)
BDM-4652
Sqoop mappings fail with a null pointer exception on the Spark engine if you do not configure the Spark HDFS staging directory in the Hadoop connection.
BDM-4598
If the Data Integration Service becomes unavailable while running mappings with Hive sources and targets on the Blaze engine, the lock acquired on a Hive target table may fail to be released.
Workaround: Connect to Hive using a Hive client such as the Apache Hive CLI or Beeline, and then run the UNLOCK TABLE <table_name> command to release the lock.
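For example, with Beeline (the connection URL and table name are placeholders):
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "UNLOCK TABLE <table_name>;"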
BDM-4473
The Data Integration Service fails with out of memory errors when you run a large number of concurrent mappings on the Spark engine.
Workaround: Increase the heap memory settings on the machine where the Data Integration Service runs.
BDM-4471
In a Hortonworks HDP or an Azure HDInsight environment, a mapping that runs on the Hive engine enabled for Tez loads only the first data table to the target if the mapping contains a Union transformation.
Workaround: Run the mapping on the Hive engine enabled for MapReduce.
BDM-4323
If an SQL override in the Hive source contains a DISTINCT or LIMIT clause, the mapping fails on the Spark engine.
BDM-4230
If the Blaze Job Monitor starts on a node different from the node that it last ran on, the Administrator tool displays the Monitoring URL of the previous node.
Workaround: Correct the URL with the current Job Monitor host name from the log, or restart the Grid Manager to correct the URL for new jobs that start.
BDM-4137
If a Sqoop source or target contains a column name with double quotes, the mapping fails on the Blaze engine. However, the Blaze Job Monitor incorrectly indicates that the mapping ran successfully and that rows were written into the target.
BDM-4107
If a mapping or workflow contains a parameter, the mapping does not return system-defined mapping outputs when run in the Hadoop environment.
BDM-3989
Blaze mappings fail with the error "The Integration Service failed to generate the grid execution plan for the mapping" when any of the following conditions are true:
  • The Apache Ranger KMS is not configured correctly on a Hortonworks HDP cluster.
  • The Hadoop KMS is not configured correctly for HDFS transparent encryption on a Cloudera CDH cluster.
  • The properties hadoop.kms.proxyuser.<SPN_user>.groups and hadoop.kms.proxyuser.<SPN_USER>.hosts for the Kerberos SPN are not set on the Hadoop cluster.
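For the last condition, the proxy user properties for the Kerberos SPN can be set in kms-site.xml on the cluster, for example as follows (the wildcard values are an assumption; restrict them according to your security policy):
<property>
  <name>hadoop.kms.proxyuser.<SPN_user>.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.kms.proxyuser.<SPN_user>.hosts</name>
  <value>*</value>
</property>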
BDM-3981
When you run a Sqoop mapping on the Blaze engine to export Netezza numeric data, the scale part of the data is truncated.
BDM-3853
When the Blaze engine runs a mapping that uses source or target files in the WASB location on a cluster, the mapping fails with an error like:
java.lang.RuntimeException: [<error_code>] The Integration Service failed to run Hive query [exec0_query_6] for task [exec0] due to following error: <error_code> message [FAILED: ... Cannot run program "/usr/lib/python2.7/dist-packages/hdinsight_common/decrypt.sh": error=2, No such file or directory], ...
The mapping fails because the cluster attempts to decrypt the data but cannot find a file needed to perform the decryption operation.
Workaround: Find the following files on the cluster and copy them to the /usr/lib/python2.7/dist-packages/hdinsight_common directory on the machine that runs the Data Integration Service:
  • key_decryption_cert.prv
  • decrypt.sh
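For example, you can copy the files from a cluster node with scp (the user and host names are placeholders):
scp <user>@<cluster_node>:/usr/lib/python2.7/dist-packages/hdinsight_common/decrypt.sh /usr/lib/python2.7/dist-packages/hdinsight_common/
scp <user>@<cluster_node>:/usr/lib/python2.7/dist-packages/hdinsight_common/key_decryption_cert.prv /usr/lib/python2.7/dist-packages/hdinsight_common/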
BDM-3779
Sqoop mappings fail on the Blaze engine if there are unconnected ports in a target. This issue occurs when you run the Sqoop mapping on any cluster other than a Cloudera 5.8 cluster.
Workaround: Before you run the mapping, create a table in the target database with columns corresponding to the connected ports.
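For example, if only two ports are connected to the target (the table and column definitions are illustrative):
CREATE TABLE <target_table> (customer_id INTEGER, customer_name VARCHAR(255));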
BDM-3744
When a Hadoop cluster is restarted without stopping the components of the Blaze engine, stale Blaze processes remain on the cluster.
Workaround: Kill the stale processes using the pkill command.
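For example (the search pattern is an assumption; confirm the names of the stale Blaze processes on your cluster before killing them):
ps -ef | grep <Blaze_process_name>
pkill -f <Blaze_process_name>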
BDM-3687
When you run a Sqoop mapping on the Spark engine, the Sqoop map-reduce jobs run in the default yarn queue instead of the yarn queue that you configure.
Workaround: To run a map-reduce job in a particular yarn queue, configure the following property in the Sqoop Arguments field of the JDBC connection:
-Dmapreduce.job.queuename=<NameOfTheQueue>
To run a Spark job in a particular yarn queue, configure the following property in the Hadoop connection:
spark.yarn.queue=<NameOfTheQueue>
BDM-3635
When you run a Sqoop mapping and abort the mapping from the Developer tool, the Sqoop map-reduce jobs continue to run.
Workaround: On the Sqoop data node, run the following command to kill the Sqoop map-reduce jobs:
yarn application -kill <application_ID>
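To find the application ID of the Sqoop job, you can first list the running applications:
yarn application -list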
BDM-3544
When the proxy user setting is not correctly configured in core-site.xml, a mapping that you run with the Spark engine hangs with no error message.
Workaround: Set the value of the following properties in core-site.xml to “*” (asterisk):
  • hadoop.proxyuser.<Data Integration Service user name>.groups
  • hadoop.proxyuser.<Data Integration Service user name>.hosts
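For example, assuming a hypothetical Data Integration Service user named infadis, the core-site.xml entries look like this:
<property>
  <name>hadoop.proxyuser.infadis.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.infadis.hosts</name>
  <value>*</value>
</property>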
BDM-3416
When you run a mapping on a cluster where Ranger KMS authorization is configured, the mapping fails with an "UndeclaredThrowableException" error.
To address this issue, choose one of the following workarounds:
  • If the cluster uses Ranger KMS for authorization and the mapping accesses the encryption zone, verify that the dfs.encryption.key.provider.uri property is correctly configured in hive-site.xml or hdfs-site.xml.
  • If the cluster does not use Ranger KMS, and you still encounter this issue, remove the dfs.encryption.key.provider.uri property from hive-site.xml and hdfs-site.xml.
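For reference, a typical dfs.encryption.key.provider.uri entry points at the KMS endpoint (the host and port are placeholders; Ranger KMS commonly listens on port 9292):
<property>
  <name>dfs.encryption.key.provider.uri</name>
  <value>kms://http@<kms_host>:9292/kms</value>
</property>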
BDM-3303
When you run a Sqoop mapping on the Blaze engine and the columns contain Unicode characters, the Sqoop program reads them as null values.
BDM-3267
On a Blaze engine, when an unconnected Lookup expression is referenced in a join condition, the mapping fails if the master source is branched and the Joiner transformation is optimized with a map-side join. The mapping fails with the following error: [TE_7017] Internal error. Failed to initialize transformation [producer0]. Contact Informatica Global Customer Support.
BDM-3228
A user who is not in the Administrator group, but who has the privileges and permissions to access the domain and its services, does not have access to the Rest application properties in the Administrator tool when the applications are deployed by another user.
BDM-2641
When mappings fail, the Spark engine does not drop temporary Hive tables used to store data during mapping execution. You can manually remove the tables. (450507)
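For example, from a Hive client (the table name is a placeholder; check the Hive database that the mapping used for staging to find the leftover tables):
DROP TABLE IF EXISTS <temporary_table_name>;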
BDM-2222
The Spark engine does not run the footer row command configured for a flat file target. (459942)
BDM-2181
The summary and detail statistics are empty for mappings run on Tez. (452224)
BDM-2141
A mapping with a Hive source and target that uses an ABS function with an IIF function fails in the Hadoop environment. (424789)
BDM-2137
A mapping in the Hadoop environment fails when it contains a Hive source and a filter condition that uses the default table name prefixed to the column name.
Workaround: Edit the filter condition to remove the table name prefixed to the column name and run the mapping again. (422627)
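For example, change a filter condition such as customers.customer_id > 100 to customer_id > 100 (the table and column names are illustrative).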
BDM-2136
A mapping in the Hadoop environment fails when the Hadoop connection name contains 128 characters. (421834)
BDM-1423
Sqoop mappings that import data from or export data to an SSL-enabled database fail on the Blaze engine.
BDM-1271
If you define an SQL override in the Hive source and choose to update the output ports based on the custom query, the mapping fails on the Blaze engine.
BDM-960
Mappings with an HDFS connection fail with a permission error on the Spark and Hive engines when all the following conditions are true:
  • The HDFS connection user is different from the Data Integration Service user.
  • The Hadoop connection does not have an impersonation user defined.
  • The Data Integration Service user does not have write access to the HDFS target folder.
Workaround: In the Hadoop connection, define an impersonation user with write permission to access the HDFS target folder.