Troubleshooting a Mapping in a Hadoop Environment

When I run a mapping with a Hive source or a Hive target on a different cluster, the Data Integration Service fails to push the mapping to Hadoop with the following error:
Failed to execute query [exec0_query_6] with error code [10], error message [FAILED: Error in semantic analysis: Line 1:181 Table not found customer_eur], and SQL state [42000]].
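When you run a mapping in a Hadoop environment, the Hive connection selected for the Hive source or Hive target and the mapping must be on the same Hive metastore. If they are on different metastores, the table that the source or target refers to is not found when the mapping is pushed to Hadoop.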
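One way to check this condition is to compare the metastore addresses that the two clusters advertise. The following is a minimal sketch, assuming each cluster exposes a standard hive-site.xml that sets the Hive property hive.metastore.uris. The function names and the example host names are hypothetical and are not part of Informatica Big Data Management.

```python
import xml.etree.ElementTree as ET


def metastore_uris(hive_site_xml):
    """Return the set of URIs listed in hive.metastore.uris, or an empty set."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hive.metastore.uris":
            value = prop.findtext("value", "")
            # The property may hold a comma-separated list of URIs.
            return {uri.strip() for uri in value.split(",") if uri.strip()}
    return set()


def same_metastore(xml_a, xml_b):
    """True only if both files name the same non-empty set of metastore URIs."""
    uris_a, uris_b = metastore_uris(xml_a), metastore_uris(xml_b)
    return bool(uris_a) and uris_a == uris_b


# Hypothetical hive-site.xml contents for two clusters.
CLUSTER_A = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-a.example.com:9083</value>
  </property>
</configuration>"""

CLUSTER_B = CLUSTER_A.replace("metastore-a", "metastore-b")

if __name__ == "__main__":
    print(same_metastore(CLUSTER_A, CLUSTER_A))
    print(same_metastore(CLUSTER_A, CLUSTER_B))
```

If the two sets differ, the source or target connection probably points at a metastore other than the one the mapping runs against, which is consistent with the "Table not found" error above.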
When I run a mapping with the MapR 2.1.2 distribution that processes large amounts of data, monitoring the mapping from the Administrator tool stops.
Check the Hadoop task tracker log to see if there is a timeout that causes the Hadoop job tracker and the Hadoop task tracker to lose connection. To continuously monitor the mapping from the Administrator tool, increase the virtual memory to 640 MB in the hadoopEnv.properties file by setting -Xmx640M in the infapdo.java.opts property. The default is 512 MB. For example:
infapdo.java.opts=-Xmx640M -XX:GCTimeRatio=34 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=2 -XX:NewRatio=2 -Djava.library.path=$HADOOP_NODE_INFA_HOME/services/shared/bin:$HADOOP_NODE_HADOOP_DIST/lib/native/Linux-amd64-64 -Djava.security.egd=file:/dev/./urandom -Dmapr.library.flatclass
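The change to the -Xmx value can also be scripted. The following is a minimal sketch, assuming the contents of hadoopEnv.properties are already read into a string and that the infapdo.java.opts line contains a single -Xmx option. The function name set_max_heap is hypothetical. Back up the file before you write any modified text to it.

```python
import re


def set_max_heap(properties_text, heap="640M"):
    """Return the text with -Xmx replaced on the infapdo.java.opts line only."""
    def fix_line(line):
        if line.startswith("infapdo.java.opts="):
            # Replace the first -Xmx option and leave all other options untouched.
            return re.sub(r"-Xmx\S+", "-Xmx" + heap, line, count=1)
        return line

    return "\n".join(fix_line(line) for line in properties_text.split("\n"))


if __name__ == "__main__":
    before = "infapdo.java.opts=-Xmx512M -XX:GCTimeRatio=34 -XX:+UseConcMarkSweepGC"
    print(set_max_heap(before))
```

Lines for other properties pass through unchanged, so the sketch does not alter any other setting that happens to contain an -Xmx option.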
When I run a mapping with a Hadoop distribution on MapReduce 2, the Administrator tool shows the percentage of completed reduce tasks as 0% instead of 100%.
Verify that the Hadoop jobs have reduce tasks.
When the Hadoop distribution is on MapReduce 2 and the Hadoop jobs do not contain reducer tasks, the Administrator tool shows the percentage of completed reduce tasks as 0%.
When the Hadoop distribution is on MapReduce 2 and the Hadoop jobs contain reducer tasks, the Administrator tool shows the percentage of completed reduce tasks as 100%.


Updated July 03, 2018