Mass Ingestion Guide

Configuring a Hive Target

Configure a Hive target to ingest data to a Hive table. When you configure the mass ingestion specification to ingest data to a Hive target, you configure a Hive connection and Hive properties to define the target.
You can ingest data to an internal or external Hive table. Internal Hive tables are managed by Hive. External Hive tables are unmanaged tables. For an external Hive table, you can specify an external location such as Amazon S3, Azure Blob, HDFS, WASB, or ADLS.
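In Hive DDL, the difference between an internal and an external table typically shows up as the EXTERNAL keyword and a LOCATION clause. The following Python sketch builds such a statement as a string for illustration only. The function name `build_create_table`, the `STORAGE_CLAUSES` mapping, and the sample schema and column names are hypothetical, and the output is not necessarily the DDL that the Mass Ingestion tool generates.

```python
from typing import Optional, Sequence, Tuple

# Assumed mapping from storage format names to HiveQL STORED AS keywords.
STORAGE_CLAUSES = {
    "Text": "TEXTFILE",
    "Avro": "AVRO",
    "Parquet": "PARQUET",
    "ORC": "ORC",
}


def build_create_table(
    schema: str,
    table: str,
    columns: Sequence[Tuple[str, str]],
    storage_format: str = "Text",
    external_location: Optional[str] = None,
) -> str:
    """Return an illustrative HiveQL CREATE TABLE statement.

    If external_location is set, the table is declared EXTERNAL and
    points at that location. Otherwise Hive manages the table in its
    default warehouse directory.
    """
    column_list = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    keyword = "CREATE EXTERNAL TABLE" if external_location else "CREATE TABLE"
    ddl = (
        f"{keyword} IF NOT EXISTS `{schema}`.`{table}` "
        f"({column_list}) STORED AS {STORAGE_CLAUSES[storage_format]}"
    )
    if external_location:
        ddl += f" LOCATION '{external_location}'"
    return ddl


if __name__ == "__main__":
    cols = [("id", "INT"), ("name", "STRING")]
    print(build_create_table("sales", "product", cols))
    print(build_create_table("sales", "product", cols, "Parquet", "/temp/PRODUCT/"))
```

The sketch does not escape quotes in the location string, so it is suitable only for trusted input.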
The following image shows the Target page for a Hive target:
[Image: The Target page of the mass ingestion specification for a Hive target, where you configure the properties that define the Hive target. A note at the bottom of the page says "Partition and cluster options are available on the Table Parameters page." In the top-right corner, Next goes to the next page and X discards the specification.]
The following table describes the properties that you can configure to define the Hive target:
Property | Description
--- | ---
Target Connection | Required. The Hive connection used to find the Hive storage target. If changes are made to the available Hive connections, refresh the browser or log out and log back in to the Mass Ingestion tool.
Target Schema | Required. The schema that defines the target tables.
Target Table Prefix | The prefix added to the names of the target tables. Enter a string. You can enter alphanumeric and underscore characters. The prefix is not case sensitive.
Target Table Suffix | The suffix added to the names of the target tables. Enter a string. You can enter alphanumeric and underscore characters. The suffix is not case sensitive.
Hive Options | Select this option to configure the Hive target location.
DDL Query | Select this option to configure a custom DDL query that defines how data from the source tables is loaded to the target tables.
Storage Format | Required. The storage format of the target tables. You can select Text, Avro, Parquet, or ORC. Default is Text.
External Table | Select this option if the table is external.
External Location | The external location of the Hive target. By default, tables are written to the default Hive warehouse directory. A sub-directory is created under the specified external location for each source that is ingested. For example, you can enter /temp. A source table named PRODUCT is ingested to the external location /temp/PRODUCT/
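The prefix, suffix, and external location rules can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation. The function names are hypothetical, and the sketch assumes that the prefix and suffix are concatenated directly to the source table name with no separator, which the guide does not state.

```python
import posixpath
import re

# Prefix and suffix may contain only alphanumeric and underscore characters.
_AFFIX_PATTERN = re.compile(r"[A-Za-z0-9_]*")


def validate_affix(value: str) -> str:
    """Return value unchanged if it is a legal prefix or suffix."""
    if not _AFFIX_PATTERN.fullmatch(value):
        raise ValueError(f"invalid prefix or suffix: {value!r}")
    return value


def target_table_name(source_table: str, prefix: str = "", suffix: str = "") -> str:
    """Build a target table name (assumption: plain concatenation)."""
    return f"{validate_affix(prefix)}{source_table}{validate_affix(suffix)}"


def external_subdirectory(external_location: str, source_table: str) -> str:
    """Return the per-source sub-directory under the external location."""
    return posixpath.join(external_location, source_table) + "/"


if __name__ == "__main__":
    print(target_table_name("PRODUCT", prefix="stg_", suffix="_v1"))
    print(external_subdirectory("/temp", "PRODUCT"))
```

With the external location /temp and the source table PRODUCT, `external_subdirectory` returns /temp/PRODUCT/, matching the example in the table.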
Configure partition and cluster properties for specific target Hive tables when you configure the transformation override.
