Hive Connector

Column partitioning for targets

You can organize tables or data sets into partitions to group the same type of data based on a column.
You can write data to an existing Hive target that has partitioned columns, or you can configure partitions when you create a new Hive target at runtime.
When you create a new target at runtime, you can select the incoming columns that you want to add as partition columns. Include the partition columns from the list of fields that display in Partition Fields on the Partitions tab in the Target transformation.
When you select columns as partition columns, their data type defaults to String. You cannot edit the data type of a partition column in the Hive target object. You can add, delete, and reorder the partition fields as required, but each partition field must have a unique partition order.
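The ordering rule can be sketched in Python. This is only an illustration under stated assumptions, not how Data Integration validates the fields internally: the PartitionField class and the ordered_partition_fields function are hypothetical names introduced here, and the field names in any usage are examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionField:
    """Hypothetical model of one row in the Partition Fields list."""
    name: str
    order: int
    data_type: str = "string"  # partition columns default to String


def ordered_partition_fields(fields):
    """Return the fields sorted by partition order.

    Raises ValueError if two fields share the same partition order.
    """
    seen = {}  # maps a partition order to the field name that claimed it
    for field in fields:
        if field.order in seen:
            raise ValueError(
                f"partition order {field.order} is used by both "
                f"{seen[field.order]!r} and {field.name!r}"
            )
        seen[field.order] = field.name
    return sorted(fields, key=lambda f: f.order)
```

Calling ordered_partition_fields with fields that have orders 3, 1, and 2 returns them in the order 1, 2, 3. Passing two fields with the same order raises an error instead of silently picking one.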
You can also create buckets to divide large data sets into more manageable parts. To configure buckets, you must first specify the number of buckets when you create a new target at runtime. After you specify the number, select the bucket fields, and then select the partitioning fields.
Data Integration creates all the fields that you select for partitioning in the target based on the partition order that you specify. For example, you can create a table that stores employee joining details partitioned hierarchically by month, hour, year, and date, in that order.
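To make the effect of the partition order concrete, the following Python sketch builds the kind of Hive DDL that corresponds to such a target and shows the resulting partition directory layout. It is a minimal sketch under assumptions: the statement that Data Integration actually generates is not documented here, the function names build_create_table and partition_path are hypothetical, and the table name, non-partition columns, and bucket count are made-up examples. The PARTITIONED BY and CLUSTERED BY ... INTO n BUCKETS clauses are standard Hive syntax.

```python
def build_create_table(table, columns, partition_fields,
                       bucket_fields=(), num_buckets=0):
    """Build a Hive CREATE TABLE statement (illustrative only).

    columns: list of (name, hive_type) pairs for non-partition columns.
    partition_fields: field names in partition order; each is typed STRING.
    bucket_fields: non-partition columns to bucket on, if num_buckets > 0.
    """
    def quote(name):
        # Backticks guard against reserved words such as `date`.
        return f"`{name}`"

    if num_buckets > 0 and not bucket_fields:
        raise ValueError("bucket fields are required when num_buckets > 0")

    col_sql = ", ".join(f"{quote(n)} {t}" for n, t in columns)
    part_sql = ", ".join(f"{quote(n)} STRING" for n in partition_fields)
    ddl = f"CREATE TABLE {quote(table)} ({col_sql}) PARTITIONED BY ({part_sql})"
    if num_buckets > 0:
        bucket_sql = ", ".join(quote(n) for n in bucket_fields)
        ddl += f" CLUSTERED BY ({bucket_sql}) INTO {num_buckets} BUCKETS"
    return ddl


def partition_path(partition_fields, row):
    """Return the Hive-style directory suffix for one row's partition values."""
    return "/".join(f"{name}={row[name]}" for name in partition_fields)


# Example values are assumptions, not taken from the product.
order = ["month", "hour", "year", "date"]
ddl = build_create_table(
    "employee_joining",
    [("emp_id", "INT"), ("emp_name", "STRING")],
    order,
    bucket_fields=["emp_id"],
    num_buckets=4,
)
path = partition_path(
    order, {"month": "01", "hour": "09", "year": "2024", "date": "15"}
)
```

Here path evaluates to month=01/hour=09/year=2024/date=15, which shows that the first field in the partition order becomes the outermost directory level. Changing the partition order changes this nesting.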
