Hive Connector

Column partitioning for targets

You can organize tables or data sets into partitions to group the same type of data based on a column.

You can write data to an existing Hive target that has partitioned columns, or you can configure partitions when you create a new Hive target at runtime.

When you create a new target at runtime, you can select the incoming columns that you want to add as partition columns. Include the partition columns from the list of fields that display in Partition Fields on the Partitions tab in the Target transformation.

When you select columns as partition columns, the default data type for those columns is set to String. You cannot edit the data type of a partition column in the Hive target object. You can add, delete, and reorder the partition fields, if required. Do not assign the same partition order to more than one column.
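The rules above can be sketched in code. The following Python snippet is a minimal illustration only, not how Data Integration implements the feature: the `partitioned_by_clause` helper and the field names are hypothetical. It rejects duplicate partition orders and renders every partition column as STRING in a Hive-style PARTITIONED BY clause.

```python
def partitioned_by_clause(partition_fields):
    """Build a Hive PARTITIONED BY clause from (column, order) pairs.

    Hypothetical helper for illustration; names are assumptions.
    """
    orders = [order for _, order in partition_fields]
    # The same partition order must not be used for multiple columns.
    if len(orders) != len(set(orders)):
        raise ValueError("Each partition column needs a distinct partition order")

    # Arrange the columns by their partition order.
    ordered = sorted(partition_fields, key=lambda field: field[1])

    # Partition columns default to the String data type.
    columns = ", ".join(f"`{name}` STRING" for name, _ in ordered)
    return f"PARTITIONED BY ({columns})"


print(partitioned_by_clause([("join_month", 2), ("join_year", 1)]))
```

Sorting by the order value before rendering means the sequence in which the fields were selected does not matter; only the assigned partition order does.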

You can also create buckets to divide large data sets into more manageable parts. To configure buckets, you must first specify the number of buckets when you create a new target at runtime. After you specify the number, select the bucket fields, and then select the partitioning fields.
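As a rough sketch of what a bucketed and partitioned target looks like in Hive terms, the Python snippet below assembles a CREATE TABLE statement that uses Hive's CLUSTERED BY ... INTO n BUCKETS syntax. The `create_table_ddl` helper, the table name, and the column names are hypothetical, and the statement that Data Integration actually generates may differ.

```python
def create_table_ddl(table, columns, bucket_count, bucket_fields, partition_fields):
    """Assemble a Hive CREATE TABLE statement with buckets and partitions.

    columns maps non-partition column names to Hive types.
    Hypothetical helper for illustration only.
    """
    # The number of buckets has to be specified before anything else.
    if bucket_count < 1:
        raise ValueError("Specify a positive number of buckets first")

    # Bucket fields must be regular columns of the table.
    missing = [name for name in bucket_fields if name not in columns]
    if missing:
        raise ValueError(f"Unknown bucket fields: {missing}")

    col_defs = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns.items())
    parts = ", ".join(f"`{name}` STRING" for name in partition_fields)
    buckets = ", ".join(f"`{name}`" for name in bucket_fields)
    return (
        f"CREATE TABLE `{table}` ({col_defs}) "
        f"PARTITIONED BY ({parts}) "
        f"CLUSTERED BY ({buckets}) INTO {bucket_count} BUCKETS"
    )


print(create_table_ddl(
    "employee_joining",
    {"emp_id": "INT", "emp_name": "STRING"},
    4,
    ["emp_id"],
    ["join_year"],
))
```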

Data Integration creates all the fields that you select for partitioning in the target based on the partition order that you specify. For example, you can create a table to write employee joining details categorized in a hierarchical order such as month, hour, year, and date.
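To see why the partition order matters, recall that Hive conventionally stores each partition as a nested directory named column=value, one level per partition column. The Python sketch below builds such a path for the month, hour, year, and date order from the example. The `partition_path` helper, the warehouse location, and the row values are hypothetical.

```python
from pathlib import PurePosixPath


def partition_path(table_location, partition_order, row):
    """Return the nested partition directory for one row.

    partition_order lists the partition columns from outermost to innermost.
    Hypothetical helper for illustration only.
    """
    path = PurePosixPath(table_location)
    for column in partition_order:
        # Each partition column adds one directory level: column=value.
        path = path / f"{column}={row[column]}"
    return str(path)


row = {"month": "06", "hour": "09", "year": "2023", "date": "15"}
print(partition_path(
    "/warehouse/employee_joining",
    ["month", "hour", "year", "date"],
    row,
))
```

Changing the partition order changes the nesting of the directories, so the first partition column in the order becomes the outermost level of the hierarchy.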
