Connections and Cluster Distributions that Support Data Preview
When you preview data using the Spark engine, configure the mapping with a supported connection and Hadoop distribution.
Connections
You can use the Spark engine to preview data on mappings that use the following connections:
HBase
HDFS
Hive
JDBC configured for Sqoop
When a mapping uses a JDBC connection, you can use either a generic JDBC connection or the specialized drivers for Oracle or Teradata.
Cluster Distributions
You can preview data in mappings configured to run with the following distributions:
Amazon EMR
Azure HDInsight
Cloudera CDH
Cloudera CDP
Hortonworks HDP
MapR
Before you preview data on Amazon EMR, you must configure the /etc/hosts file on all nodes in the cluster to include the machine name and IP address of the Data Integration Service.
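To confirm that a node has the required /etc/hosts entry, a check along the following lines might help. This is a minimal sketch and not part of the product: the function name hosts_entry_present is hypothetical, and the IP address 192.0.2.10 and host name dis-host.example.com are placeholders for the machine that hosts the Data Integration Service.

```python
def hosts_entry_present(hosts_text: str, ip: str, hostname: str) -> bool:
    """Return True if hosts_text maps ip to hostname (canonical name or alias)."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split the entry into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        # A valid entry has an IP address followed by at least one name.
        if len(fields) >= 2 and fields[0] == ip:
            # Host names are case-insensitive, so compare in lowercase.
            if hostname.lower() in (name.lower() for name in fields[1:]):
                return True
    return False


# Placeholder values (assumptions): replace with the real Data Integration Service machine.
DIS_IP = "192.0.2.10"
DIS_HOST = "dis-host.example.com"

# Sample hosts-file content used for illustration only.
sample = """\
127.0.0.1   localhost
# Data Integration Service
192.0.2.10  dis-host.example.com dis-host
"""

print(hosts_entry_present(sample, DIS_IP, DIS_HOST))
```

To check a real cluster node, pass the contents of the node's /etc/hosts file instead of the sample text, and repeat the check on every node in the cluster.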