Hive Warehouse Connector and Hive LLAP

Enable the Hive Warehouse Connector and Hive LLAP for faster execution of Hive queries when you read from and write to Hive tables. You can use the Hive Warehouse Connector and Hive LLAP with Hortonworks HDP 3.x and Microsoft Azure HDInsight 4.x clusters on the Spark engine.
The Hive Warehouse Connector reads from and writes to Hive tables without using temporary staging tables that require additional storage overhead. Use the Hive Warehouse Connector on the Spark engine to allow Spark code to interact with Hive targets and to use ACID-enabled Hive tables. When you enable the Hive Warehouse Connector, mappings use Hive LLAP to run Hive queries rather than HiveServer2.
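You enable the connector through connection properties, and the Data Integration Service generates the Spark code for you. As a point of reference only, the following Scala sketch shows how a Spark application typically uses the underlying Hortonworks Hive Warehouse Connector API on an HDP 3.x cluster. The host names, ZooKeeper quorum, and table names are placeholders, and the sketch assumes that the hive-warehouse-connector-assembly JAR is on the Spark classpath.

    import org.apache.spark.sql.SparkSession
    import com.hortonworks.hwc.HiveWarehouseSession

    // Endpoint values are placeholders; on a real cluster they come from
    // the hive-site.xml settings for HiveServer2 Interactive (LLAP).
    val spark = SparkSession.builder()
      .appName("hwc-llap-sketch")
      // JDBC URL of HiveServer2 Interactive, not classic HiveServer2.
      .config("spark.sql.hive.hiveserver2.jdbc.url",
              "jdbc:hive2://llap-host:10500/")
      // ZooKeeper registry entry for the LLAP daemons.
      .config("spark.hadoop.hive.llap.daemon.service.hosts", "@llap0")
      .config("spark.hadoop.hive.zookeeper.quorum",
              "zk1:2181,zk2:2181,zk3:2181")
      .getOrCreate()

    // Build a Hive Warehouse Connector session. Queries routed through it
    // run on the LLAP daemons rather than through HiveServer2.
    val hive = HiveWarehouseSession.session(spark).build()

    // Read a Hive table into a DataFrame; no temporary staging table is used.
    val ordersDf = hive.executeQuery("SELECT * FROM sales.orders")

    // Write to an ACID-enabled, non-bucketed ORC table through the connector.
    ordersDf.write
      .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
      .option("table", "sales.orders_copy")
      .save()

Reads and writes issued through the session go to the LLAP daemons directly, rather than through a temporary staging table in HDFS.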
Consider the following limitations when you use the Hive Warehouse Connector and Hive LLAP:
  • The Hive Warehouse Connector and Hive LLAP support insert queries only on ACID-enabled tables that are not bucketed.
  • You cannot use the Hive Warehouse Connector and Hive LLAP when you read hierarchical data from a source.
  • When you use the Hive Warehouse Connector on Hortonworks HDP clusters, you must use an ORC format target. Data corruption might occur if the target does not use ORC format. A table definition that satisfies this requirement appears in the sketch after this list.
    For more information, see the Hortonworks documentation on supported target tables: Apache Hive 3 tables.
  • When you use an external table that has compression properties set, the mapping executes using Spark SQL instead of HiveServer2. The mapping fails if the value of the compression property is not one of the following values: LZO, NONE, SNAPPY, ZLIB.
    The property value is case sensitive and must use upper case.
    For more information, refer to the Apache Hive documentation on compression kind: CompressionKind (Hive 2.1.1 API)
  • When you choose RETAIN as the target schema strategy, configure the property hive.llap.daemon.num.enabled.executors on the Hadoop cluster. Set the value of this property to the same value as hive.llap.daemon.num.executors.
  • When you run a mapping with an ACID-enabled source and target, the Summary Statistics view does not reflect any throughput statistics for the mapping job.
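The following sketch, which reuses the hive session built in the earlier example, shows a target table definition that satisfies the ORC, ACID, bucketing, and compression requirements described above. The database, table, and column names are illustrative placeholders.

    // "hive" is the HiveWarehouseSession built in the earlier sketch.
    // executeUpdate runs the DDL statement through LLAP.
    hive.executeUpdate("""
      CREATE TABLE sales.orders_copy (
        order_id BIGINT,
        amount   DOUBLE
      )
      STORED AS ORC
      TBLPROPERTIES (
        'transactional' = 'true',  -- ACID-enabled
        'orc.compress'  = 'ZLIB'   -- must be LZO, NONE, SNAPPY, or ZLIB, upper case
      )
    """)
    // No CLUSTERED BY ... INTO n BUCKETS clause, so the table is not
    // bucketed, which matches the insert-query limitation above.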
