Big Data Release Notes

Third-Party Known Limitations

The following entries describe known limitations. Each entry lists the bug number, followed by the description, the third-party ticket reference where available, and a workaround where one exists:
OCON-9943
If you configure Sqoop to import time data from a Netezza database, the mapping fails.
Apache ticket reference number: SQOOP-2978
OCON-8786
If you configure Sqoop to export data of the Clob or DBClob data type to IBM DB2 z/OS targets, the mapping fails.
OCON-8561
If you configure Sqoop to export data of the Money data type to Microsoft SQL Server targets, the mapping fails.
OCON-8387
If you configure TDCH and Sqoop and run a mapping on the Blaze or Spark engine to export data of the Time data type, only milliseconds are written to the target. The nanosecond part is truncated.
Cloudera ticket reference number: 124306
OCON-8332
If you configure Sqoop to export data of the Clob or DBClob data type to IBM DB2 targets, the mapping fails.
OCON-7974
If you configure Sqoop and a column name contains spaces, the mapping fails.
Apache ticket reference number: SQOOP-2737
OCON-7620
If you import data from an IBM DB2 source through Sqoop and the table name contains mixed case characters, the mapping fails.
Sqoop JIRA issue number: SQOOP-3211
OCON-7505
Sqoop mappings that read byte or varbyte data from a Teradata source and write it to a Teradata target fail on the Blaze engine. This issue occurs if you use Cloudera Connector Powered by Teradata.
Cloudera ticket reference number: 124305
OCON-7504
When you use Sqoop to read data of the Timestamp data type from a Teradata source and write it to a Teradata target, only milliseconds are written to the target. This issue occurs if you run the Teradata Parallel Transporter mapping on a Cloudera cluster and on the Blaze engine.
Cloudera ticket reference number: 124302
OCON-7503
When you use Sqoop to read time data from a Teradata source and write it to a Teradata target, the fractional seconds get corrupted. This issue occurs if you use Cloudera Connector Powered by Teradata or Hortonworks Connector for Teradata, and run the mapping on the Blaze engine.
Cloudera ticket reference number: 124306
OCON-7459
When you export data to an IBM DB2 target through Sqoop, the mapping fails if all of the following conditions are true:
  • You create or replace the IBM DB2 target table at run time.
  • The IBM DB2 target table name or column names contain mixed case characters.
  • You run the mapping on a Cloudera 5u8 cluster.
Apache ticket reference number: SQOOP-3212
OCON-7431
When you read time data from a Teradata source and write it to a Teradata target, the fractional seconds get corrupted. This issue occurs if you run the Teradata Parallel Transporter mapping on a Hortonworks cluster and on the Blaze engine.
Cloudera ticket reference number: 124302
OCON-7219
When you run a Sqoop mapping on the Blaze engine to export Teradata float data, the data is truncated after the decimal point.
Cloudera support ticket number: 113716
OCON-7214
Sqoop mappings fail on the Blaze engine if you use a custom query with the Order By clause to import data.
Sqoop JIRA issue number: SQOOP-3064
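For illustration, the following is a hypothetical custom query of the kind affected by this limitation; the table and column names are placeholders, and $CONDITIONS is the standard token that Sqoop requires in free-form queries:
  --query "SELECT emp_id, emp_name FROM employees WHERE $CONDITIONS ORDER BY emp_id"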
OCON-7213
The Sqoop program does not honor the --num-mappers and -m arguments when you export data and run the mapping on the Blaze or Spark engine.
Sqoop JIRA issue number: SQOOP-2837
OCON-7211
When you run a Sqoop mapping to import data from or export data to Microsoft SQL Server databases that are hosted on Azure, the mapping fails.
Sqoop JIRA issue number: SQOOP-2349
OCON-2847
Loading a Microsoft SQL Server resource fails when TLS encryption is enabled for the source database and the Metadata Manager repository is a Microsoft SQL Server database with TLS encryption enabled. (452471)
DataDirect case number: 00343832
OCON-1100
When you export data to an IBM DB2 z/OS database through Sqoop and do not configure the batch argument, the mapping fails.
Workaround: Configure the batch argument in the mapping and run the mapping again. (459671)
Apache ticket reference number: SQOOP-2980
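As a rough sketch, the --batch argument can be added to the Sqoop arguments for the mapping. In an equivalent standalone Sqoop export command, with placeholder connection, table, and directory values, the argument appears as follows:
  sqoop export --connect jdbc:db2://<host>:<port>/<database> --table <TARGET_TABLE> --export-dir <hdfs_directory> --batch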
IIS-2719
If the mapping contains a passive Lookup transformation that is configured with an inequality lookup condition, the target receives the data late.
IIS-2484
When you write to an HBase target in an Amazon EMR cluster and the spark.hadoop.validateOutputSpecs property is set to true, the streaming mapping fails with an exception.
Workaround: Set the spark.hadoop.validateOutputSpecs property to false.
Apache HBase ticket reference number: HBASE-20295
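For reference, the property is a standard Spark configuration entry; where you set it depends on your environment, so the following is only a sketch:
  spark.hadoop.validateOutputSpecs=false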
BDM-9585
Mappings fail on the Spark engine when you configure an SQL override to access a Hive view.
Apache Spark ticket reference number: SPARK-21154.
BDM-23420
If a mapping contains a Sorter transformation whose default value contains datetime data, the data changes when you run the mapping on a MapR cluster.
MapR ticket reference number: 00072094
BDM-23104
The Spark engine cannot write data to a bucketed Hive target if the Hadoop distribution is MapR.
MapR case number: 00074338
BDM-22675
Incorrect source statistics appear when the Spark engine processes a Union transformation.
Spark ticket reference number: SPARK-26683
BDM-22318
If a parameter default value has a newline character, the Spark engine creates additional rows and can shift column values to other columns.
Workaround: To resolve the issue with newline characters, use a SequenceFile format instead of a TextFile format for the target table.
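A minimal Hive DDL sketch of a target table stored as SequenceFile, with hypothetical table and column names:
  CREATE TABLE target_table (id INT, description STRING) STORED AS SEQUENCEFILE;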
BDM-17470
In an Azure HDInsight environment, if you enable Hive merge in an Update Strategy transformation and if Hive is enabled to execute vectorized queries, inserting data into specific columns fails.
Workaround: In the hive-site.xml file on the cluster, set the hive.vectorized.execution.enabled property to false, as shown in the sketch below.
Apache Hive ticket reference number: HIVE-14076
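The hive-site.xml entry takes the standard XML property form:
  <property>
    <name>hive.vectorized.execution.enabled</name>
    <value>false</value>
  </property>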
BDM-17204
If the impersonation user tries to run a mapping with an HDFS source and does not have the DECRYPT_EEK privilege to read the file, the log file shows an incorrect error message.
Apache ticket reference number: HADOOP-12604
BDM-17020
When you run a mapping that uses a schema in an Avro file, the Spark engine adds a NULL data type to the primitive data types in the schema.
BDM-14438
If a Parquet file has an array of the format [map (struct, any)], the mapping fails on the Spark engine with a spark.sql.AnalysisException exception.
Apache Spark ticket reference number: SPARK-22474
BDM-14422
The mapping fails with an error on the Spark engine when the Hive table contains duplicate columns.
Apache Spark ticket reference number: SPARK-23519
BDM-14410
The mapping fails because the Spark engine cannot read from an empty ORC Hive source.
Apache Spark ticket reference number: SPARK-19809
BDM-13650
Mappings fail on the Spark engine when the Spark engine runs on a secure HA cluster and the hadoop.security.auth_to_local property in the core-site.xml file contains a modified value. The mappings fail due to the following error:
Failed to renew token
Workaround: Configure the following property in the yarn-site.xml file using any node in the Hadoop cluster:
yarn.resourcemanager.address=<node name>
Apache ticket reference number: MAPREDUCE-6484
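In the yarn-site.xml file, the property takes the standard XML property form; the node name is a placeholder:
  <property>
    <name>yarn.resourcemanager.address</name>
    <value><node name></value>
  </property>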
BDM-10570
The Spark job fails with out of memory errors when a mapping that converts relational data to hierarchical data contains more than three Aggregator and Joiner transformations.
Workaround: To convert relational data to hierarchical data with more than four levels, develop more than one mapping to stage the intermediate data. For example, develop a mapping that converts relational data to hierarchical data with up to three levels, and then use that hierarchical data in another mapping to generate hierarchical data with four levels.
Apache Spark ticket reference number: SPARK-22207
BDM-10455
Inserts into a bucketed table can sometimes fail when you use Hive on Tez as the execution engine. The issue is more probable if the table is a Hive ACID table and a delete operation is performed before the inserts.
Apache ticket reference number: TEZ-3814
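To illustrate the pattern that can fail, the following Hive statements are a sketch with hypothetical table and column names; the table is a bucketed ACID table, a delete runs first, and a subsequent insert can fail when Hive on Tez is the execution engine:
  CREATE TABLE tx_table (id INT, amount DOUBLE)
  CLUSTERED BY (id) INTO 4 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');
  DELETE FROM tx_table WHERE id = 1;
  INSERT INTO tx_table VALUES (2, 10.5);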
