A mapping that runs in the Hadoop environment can write to a Hive target. When you write to a Hive target, consider the processing differences for functionality such as table types, DDL queries, and bucketing.
Hive Table Types
A Hive target can be an internal table or an external table. Internal Hive tables are managed by Hive and are also known as managed tables. External Hive tables store their data in a location that Hive does not manage, such as HDFS, Amazon S3, or Microsoft Azure Blob Storage.
When a mapping creates or replaces a Hive table, the type of table that the mapping creates depends on the run-time engine that you use to run the mapping. On the Blaze engine, mappings create managed tables. On the Spark engine, mappings create external tables.
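As a point of reference, the following is a minimal HiveQL sketch of the two table types. The table names, column names, and storage location are hypothetical:

    -- Managed (internal) table: Hive controls both the metadata and the data files.
    CREATE TABLE orders_managed (
        order_id INT,
        amount   DOUBLE
    )
    STORED AS ORC;

    -- External table: Hive tracks only the metadata; the data lives at a location
    -- outside Hive's control, such as HDFS, Amazon S3, or Azure Blob Storage.
    CREATE EXTERNAL TABLE orders_external (
        order_id INT,
        amount   DOUBLE
    )
    STORED AS ORC
    LOCATION 'hdfs:///data/orders_external';

Dropping orders_managed deletes its data files, while dropping orders_external removes only the metadata and leaves the files in place.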
A mapping that runs on the Spark engine supports Hive tables in Apache Iceberg format on Cloudera Data Platform (CDP).
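On CDP, an Iceberg-format table can be declared with the STORED BY ICEBERG clause. The following sketch assumes a CDP version that ships the Hive-Iceberg integration; the table name is hypothetical:

    CREATE EXTERNAL TABLE orders_iceberg (
        order_id INT,
        amount   DOUBLE
    )
    STORED BY ICEBERG;   -- Iceberg manages the table layout and snapshots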
DDL Queries
For mappings that run on the Spark engine or the Blaze engine, you can create a custom DDL query that creates or replaces a Hive table at run time. However, on the Blaze engine, you cannot use the backtick (`) character in the DDL query. HiveQL requires the backtick character to quote identifiers that contain special characters or reserved keywords.
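For illustration, the following hypothetical custom DDL query uses backticks to quote a hyphenated table name and a reserved keyword, so it would run on the Spark engine but be rejected on the Blaze engine:

    CREATE EXTERNAL TABLE `order-details` (   -- backticks quote the hyphen in the table name
        `timestamp` STRING,                   -- backticks quote the reserved keyword
        amount      DOUBLE
    )
    STORED AS PARQUET
    LOCATION 'hdfs:///data/order-details';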
Bucketing
The Spark engine can write to bucketed Hive targets. Bucketing and partitioning of Hive tables can improve performance by reducing data shuffling and sorting.
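A minimal sketch of a Hive target that is both partitioned and bucketed, with hypothetical names and an arbitrary bucket count:

    CREATE TABLE sales_bucketed (
        sale_id     INT,
        customer_id INT,
        amount      DOUBLE
    )
    PARTITIONED BY (sale_date STRING)            -- partition pruning skips whole directories
    CLUSTERED BY (customer_id) INTO 32 BUCKETS   -- rows are hashed on customer_id into 32 files
    STORED AS ORC;

Because rows with the same customer_id always land in the same bucket, joins and aggregations on that column can avoid a full shuffle.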
Hortonworks HDP 3.1
Hortonworks HDP 3.1 uses ACID-enabled ORC targets by default and uses ACID for all managed tables. If you do not want to use ACID-enabled tables, use external tables.
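As a sketch, declaring the target as an external table keeps HDP 3.1 from converting it to an ACID table. The names and location are hypothetical:

    -- External tables are not converted to ACID on HDP 3.1.
    CREATE EXTERNAL TABLE events_ext (
        event_id INT,
        payload  STRING
    )
    STORED AS ORC
    LOCATION 'hdfs:///data/events_ext';

    -- DESCRIBE FORMATTED shows whether an existing table is transactional.
    DESCRIBE FORMATTED events_ext;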