Hive Connector

Rules and guidelines for Hive objects in mappings

Consider the following rules and guidelines for Hive objects in mappings and mapping tasks:

General guidelines

When you select a connection in a mapping, leave it unused for a long duration, and then select an object in the mapping, an error occurs. You must reselect the connection in the mapping and then select the object.

Create target at runtime

When you create a target at runtime, consider the following guidelines:
  • When you re-run a mapping task after the target table is dropped at the back end, the task fails. You must edit the mapping or mapping task and then run the mapping task again.
  • If you create a Hive target and the data is in the RC file format, you cannot insert null values into the binary data type column.
  • You cannot write float and double data types that contain exponential data to a Hive target on the Cloudera EMR cluster.
  • When you use the Create New at Runtime option to create a Hive target, you can parameterize only the target connection and the object name using a parameter file. You cannot parameterize other properties such as the path, location, file format, and additional properties. See the parameter file sketch after this list.
  • When you configure a mapping to create a new Hive target at runtime to write data from a Kafka source, where the selected objects in the mapping are parameterized, the mapping fails with the following error:
    com.informatica.powercenter.sdk.SDKException: Invalid operation field name
  • If you enable the Alter and Apply Changes schema handling option for a mapping that creates a new target at runtime, and the source column names contain uppercase or mixed-case characters, the alter command is not triggered and the mapping fails.
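
For example, when you use the Create New at Runtime option, you might resolve the target connection and object name from a parameter file. The following lines are a minimal sketch that assumes hypothetical parameter names $$TargetConnection and $$TargetObject assigned to the Target transformation; the exact sections and names in your parameter file might differ:

  [Global]
  $$TargetConnection=Hive_Target_Conn
  $$TargetObject=customer_target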

Data types

  • When you run a mapping to write data to a Hive table and the columns use complex data types, the mapping fails. For an illustration of complex data types, see the example after this list.
  • You cannot preview a column in a table that uses the binary data type.
  • If the data is in the Avro file format and contains timestamp data types, Data Integration writes the data only up to milliseconds.
  • You cannot read or write hierarchical data with decimal data types. The mapping fails with the following error:
    java.lang.RuntimeException
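
In Hive, complex data types include arrays, maps, structs, and union types. The following CREATE TABLE statement is a minimal sketch with hypothetical table and column names; a mapping that writes to the phone_numbers, attributes, or address columns fails because of this restriction:

  CREATE TABLE customer_complex (
    customer_id INT,
    phone_numbers ARRAY<STRING>,
    attributes MAP<STRING,STRING>,
    address STRUCT<street:STRING, city:STRING, zip:STRING>
  );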

Parameterization

  • You can enter a value up to milliseconds in a filter that is not parameterized. For example:
    hive_timestamp_only.c.timestamp > 2085-11-06 06:24:12.018
    You cannot enter a value in nanoseconds in a filter that is not parameterized. For example:
    hive_timestamp_only.c.timestamp > 2085-11-06 06:24:12.01888
    You can use an advanced filter to enter a value in nanoseconds.
  • To enter a date without specifying the time in a filter that is not parameterized, use the format specified in the following example:
    hive_date_only.c.date > 2085-11-06 00:00:00.00
  • If the target object is parameterized and the field mapping is not set to automap, the option to create a target does not appear in the task properties. You must set the field mapping to automap when you use a parameterized object.
  • When you read from multiple databases, if the connection does not have the database name, you cannot use a parameter file from the mapping task.
  • Avoid reusing an existing connection to read from multiple databases in a new mapping. If you edit an existing connection to remove the database name so that you can read from multiple databases, do not use that connection in the existing mapping. If you use the same connection in the existing mapping, the mapping fails.
  • If you parameterize the source or target connection and object, specify an override with in-out parameters in the advanced properties, and also use a parameter file in the mapping task, the data is read from the table imported at design time and not from the overridden table.

Special characters

  • You cannot import data from Hive that contains Unicode characters in the table or column names. The mapping fails to run.
  • When you read from or write data to a Hive table, ensure that the column name in the Hive table does not exceed 74 characters.
  • When you import data from Hive tables, special characters in the table or column names appear as '_' in the source object and in the data preview in the Source transformation.
  • When you import a Hive source or target table where the table or column name contains special characters such as a slash '/', the staging files do not get deleted from the staging location.

SQL override

  • When you specify an SQL override to read data from a Hive source, ensure that the data type specified in the query is compatible with the data type in the selected source.
  • Specify the query without a semicolon ';' at the end of the SQL statement. If you add a semicolon at the end of the SQL statement, the task fails.
  • The SQL override in the source can contain only SELECT queries with Join or Union clauses. If the SQL override contains any other clause, such as WHERE or WITH, the mapping fails. See the example after this list.
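
For example, the following SQL override is a minimal sketch with hypothetical table and column names. It contains only a SELECT query with a Join clause and omits the trailing semicolon:

  SELECT o.order_id, o.order_date, c.customer_name
  FROM orders o
  JOIN customers c ON o.customer_id = c.customer_id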

Multiple objects

When the source type selected in a mapping is multiple objects, consider the following restrictions:
  • You cannot override the table name.
  • When you specify an override to the schema, ensure that all the objects selected in the Source transformation belong to the same schema.
  • You cannot join tables from different databases using the advanced relationship. See the example after this list.
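
For example, an advanced relationship might join two of the selected objects with a condition such as the following. The table and column names are hypothetical, and the join is valid only when both tables belong to the same database, for example sales_db; a condition that joins a table in sales_db to a table in finance_db is not supported:

  orders.customer_id = customers.customer_id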

Session logs

Ignore log4j initialization warnings in the session logs.
