Metadata Manager 10.4.0
When you use a Hive connection to create a physical data object with a schema name that is different from the schema specified in the Connection String, you cannot use a custom query to read data from a Hive source because the mapping results might be inconsistent.
A mapping can fail on the Spark engine with Amazon EMR version 5.2.0 when the following conditions are true:
A mapping can fail on the Spark engine with MapR version 6.1.0 when the following conditions are true:
When you select a Hive table from the Select a resource dialog box, the tables from other databases are also selected.
You cannot use the Show Default Schema Only option in the Connection Explorer to show tables that use the default schema and to add tables from non-default schemas.
When you use the infacmd oie importObjects command or the Developer tool to import a mapping, the import might fail with the following error:
[DBPERSISTER_1005] Failed to process requested operation. This was caused by Unable to resolve entity name from Class 
A Spark mapping that contained a Java transformation produced inaccurate results when default values were assigned to datetime and decimal ports.
A Blaze mapping that contained a Union transformation failed with inaccurate partition counts.
A cluster workflow failed with an error stating "The core-site configuration and the storageProfile contain same accounts."
If you use an Update Strategy transformation in a dynamic mapping that runs on the Blaze engine and the mapping refreshes the schema in the dynamic target at run time, the mapping fails because the Write transformation does not retain the primary key.
When the Spark engine runs a mapping in which the SPN user and the impersonation user are different, the impersonation user creates a directory for temporary files in the SPN user's directory that cannot be accessed by the SPN user. The name of the directory is the name of the SPN user.
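A quick way to reason about the failure above is to check whether the staging directory's ownership and mode bits allow the SPN user in. The sketch below is a generic diagnostic, not an Informatica utility; the path and UID it receives are hypothetical.

```python
import os
import stat

def owner_can_enter(path, expected_uid):
    """Return True if `path` is owned by `expected_uid` and the owner
    has the read and execute bits needed to list and enter it.

    Checks permission bits only; it does not attempt an actual access.
    """
    st = os.stat(path)
    owned = st.st_uid == expected_uid
    readable = bool(st.st_mode & stat.S_IRUSR)
    enterable = bool(st.st_mode & stat.S_IXUSR)
    return owned and readable and enterable
```

In the defect above, the directory is owned by the impersonation user rather than the SPN user, so a check like this against the SPN user's UID would return False.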
Mapping optimization fails with an out of memory error while initializing the global predicate optimization method.
A mapping with a Hive target fails with a validation error if there is a mismatch between the column name in the Informatica target widget and the column name in the actual table in the database.
A mapping that runs on the Blaze engine fails if the following conditions are true:
When the Model Repository Service loses its connection to the Data Integration Service, the logs associated with this session appear at the FINE trace level instead of the WARN or ERROR severity level.
Mappings produced unexpected results when a source table included an invalid column name.
When a mapping runs on the Blaze engine, the mapping might fail while moving staging data to a Hive target.
If the Spark engine runs the INSTR function before it multiplies decimals, the mapping fails to run due to the following error:
Mappings fail with a file not found error. On the Spark engine, the HDFS directory read does not prepend the default user directory when a relative directory is configured in the FF-HDFS object.
When the Databricks Spark engine runs a mapping that contains a Lookup transformation, the mapping intermittently fails with a type mismatch error.
A mapping that reads from a Redshift source and writes to an S3 target fails with an OutOfBounds exception. You might see an error like:
WARNING: [LDTM_6009] Skipping translator [ImfSparkAppSdkTargetTxTranslator] because it encountered the following exception: (lineitem_tbl_Write, SparkEngine) = java.lang.IndexOutOfBoundsException
Mappings fail intermittently with a NullPointerException in CyclicDependencyResolver.
If you specify a compression codec in a custom query, the Blaze engine fails to compress HDFS files using the codec on every Hadoop distribution except Hortonworks HDP 3.1.
A mapping with flat file sources and targets that uses the Spark engine to run on a WANdisco-enabled Hortonworks HDP 2.6.5 cluster fails.
When you run a profile from the Developer tool with the default Data Integration Service selected in the Run Configuration preferences, the profile fails to determine which mapping configuration to use with the following error:
A mapping failed when the FFTargetEscapeQuote flag was set.
After you upgrade to version 10.2.2, data preview fails with the following error for a dynamic mapping that was created before the upgrade, reads from a .dat file that contains null characters, and is configured to get object columns from the data source at run time:
When you run multiple profiling jobs in parallel with the Data Integration Service property ExecutionContextOptions.AutoInstallEnableMd5OutputFile set to true, the Data Integration Service creates duplicate file entries in the MD5 output file.
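As a stopgap, duplicate entries can be stripped from such an output file. The sketch below assumes one `<digest>  <path>` entry per line (the md5sum convention), which may not match the service's actual output format.

```python
def dedupe_lines(lines):
    """Drop exact duplicate entries, keeping first occurrences in order."""
    seen = set()
    unique = []
    for line in lines:
        key = line.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(line)
    return unique
```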
When the Blaze engine runs a mapping with a Hive source, the mapping fails intermittently with a null pointer exception. The exception occurs when the Hive custom query “SHOW LOCKS <table name> EXTENDED” results in entries with null values.
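Defensively, rows returned by a lock query can be filtered before use so that null fields never reach code that assumes they are present. This is a generic sketch; the tuple shape is assumed for illustration, not taken from the Hive client API.

```python
def drop_null_lock_rows(rows):
    """Keep only lock-query rows in which every field is present."""
    return [
        row for row in rows
        if row is not None and all(field is not None for field in row)
    ]
```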
When you perform data preview on a complex file reader object created from an Intelligent Structure Discovery model, the data preview job fails with the following error:
If a mapping on the Blaze engine reads data from a Hive table and you specify database names in both the Data Access Connection String and in the runtime properties, a SQL override uses the database in the Hive connection instead of the database in the data object.
If a mapping on the Spark engine reads data from a Hive table and you specify the database name in the data object, a SQL override uses the database in the Hive connection instead of the database in the data object.
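The two issues above describe the same precedence defect: a SQL override resolves the database from the Hive connection even when the data object names one. A minimal sketch of the intended precedence, using illustrative names only:

```python
def resolve_database(data_object_db, connection_db):
    """Intended behavior: the database named in the data object wins,
    and the connection's database is only a fallback. The defect above
    is that a SQL override skips straight to the connection's database.
    """
    return data_object_db if data_object_db else connection_db
```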
In version 10.2.2, if a mapping configured to run on the Blaze or Spark engine fails, the mapping then attempts to run on the Hive engine and fails with the following error:
The AWS cloud provisioning connection does not accept authentications from multiple security groups.
If the Python transformation runs without using Jep, the Python binaries that are installed on the Data Integration Service machine are not resolved at run time.
The Data Integration Service fails with an out of memory error when you run Spark mappings concurrently.
The Developer tool is unable to copy values from the data viewer.
A mapping with bucketed Hive tables that uses Avro files with AWS S3 buckets fails with the following error: org.apache.hadoop.hive.serde2.SerDeException: Encountered exception determining schema. Returning signal schema to indicate problem: null.
The session log displays an incorrect number of nodes for mappings that run on labeled nodes in a cluster on the Blaze and Spark engines.
When the Spark engine runs a mapping that reads from a Hive source and uses an SQL override query but is not configured to push custom queries to a database, the Spark execution plan creates views in the source database instead of the staging database.
When the Spark engine processes an input value of zero in a decimal port that is configured with equal precision and scale, the engine treats the value as data overflow and returns NULL. This limitation is tracked as BDM-28598 for a Hortonworks HDP 3.1 cluster.
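The overflow check here is too strict: a decimal(p, p) port allows zero integer digits, but the value zero needs none, so it fits. A simplified range check (ignoring rounding of the fractional part) illustrates this:

```python
from decimal import Decimal

def fits_decimal(value, precision, scale):
    """Return True if `value` fits in decimal(precision, scale).

    Simplified: checks integer digits only. Zero uses no integer
    digits, so it always fits regardless of precision and scale.
    """
    d = Decimal(value)
    if d == 0:
        return True
    int_digits = max(d.adjusted() + 1, 0)  # digits left of the point
    return int_digits <= precision - scale
```

By this rule, zero fits a decimal(5, 5) port, so returning NULL for it is the defect.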
You cannot get Spark monitoring statistics for a mapping run that uses any of the following connections: Google BigQuery, Google Cloud Storage, Google Cloud Spanner, or Google Analytics.
The Spark engine wrote incorrect timestamps in some mappings.
Implicit data conversions performed in dynamic mappings do not appear in the mapping logs.
A mapping that reads a large number of reference tables may take longer than expected to run on the Spark engine. The issue is observed when the mapping includes transformations that collectively read 140 reference tables.
Running a mapping on the Spark engine when a cluster is secured with Kerberos results in authentication errors.
When you run a mapping that includes an Expression transformation that uses the SYSTIMESTAMP function in an output, SYSTIMESTAMP returns the same value for every row processed if the argument of the function is a variable whose expression is constant.
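The behavior above is consistent with constant folding: when the engine decides an expression is constant, it evaluates the function once before row processing instead of once per row. A conceptual model, with illustrative names rather than Informatica APIs:

```python
def eval_per_row(rows, fn):
    """Evaluate `fn` for every row (the expected SYSTIMESTAMP behavior)."""
    return [fn() for _ in rows]

def eval_folded(rows, fn):
    """Evaluate `fn` once and reuse the result (the reported defect)."""
    cached = fn()  # folded: computed once, before any row is processed
    return [cached for _ in rows]
```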