Verify Run-time Drivers

Verify run-time drivers for mappings that access JDBC-compliant databases in the Hadoop environment. Use any Type 4 JDBC driver that the database vendor recommends.
  1. Download the Type 4 JDBC drivers associated with the JDBC-compliant databases that you want to access.
  2. To use the Sqoop TDCH Cloudera Connector Powered by Teradata, download all the .jar files in the Cloudera Connector Powered by Teradata package from the following location: http://www.cloudera.com/downloads.html.
    The package has the following naming convention: sqoop-connector-teradata-<version>.tar
    To use Cloudera CDP version 7.x, download the sqoop-connector-teradata-1.8.0c7.jar file.
  3. Append
    /opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec-{version}.jar
    to the mapreduce.application.classpath property in the mapred-site.xml file.
  4. To optimize Sqoop mapping performance on the Spark engine when writing data to an HDFS complex file target in the Parquet format, download the following .jar files:
    File name
    Location
    parquet-hadoop-bundle-1.6.0.jar
    parquet-avro-1.6.0.jar
    parquet-column-1.5.0.jar
  5. Copy all of the .jar files to the following directory on the machine where the Data Integration Service runs:
    <Informatica installation directory>\externaljdbcjars
    Changes take effect after you recycle the Data Integration Service. At run time, the Data Integration Service copies the .jar files to the Hadoop distribution cache so that the .jar files are accessible to all nodes in the cluster.
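
Before you recycle the Data Integration Service, it can help to confirm that the expected .jar files are actually in the externaljdbcjars directory. The following is a minimal sketch and not an Informatica-provided tool. The function name missing_jars and the EXPECTED_JARS list are hypothetical; the list contains only the Parquet file names from step 4, so add the JDBC driver and connector .jar files that you downloaded.

```python
from pathlib import Path

# Assumption: file names copied from step 4 of the procedure above.
# Extend this list with the JDBC driver and connector .jar files you downloaded.
EXPECTED_JARS = [
    "parquet-hadoop-bundle-1.6.0.jar",
    "parquet-avro-1.6.0.jar",
    "parquet-column-1.5.0.jar",
]


def missing_jars(jar_dir, expected=EXPECTED_JARS):
    """Return the expected .jar file names that are not present in jar_dir.

    jar_dir is the externaljdbcjars directory under the Informatica
    installation directory. A directory that does not exist is treated
    as empty, so every expected file is reported as missing.
    """
    present = {path.name for path in Path(jar_dir).glob("*.jar")}
    return [name for name in expected if name not in present]
```

Call missing_jars with the full path to the externaljdbcjars directory on the machine where the Data Integration Service runs. An empty list means that every file named in the list was found. The check compares file names only and does not validate the contents of the .jar files.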