You can configure Amazon Glue as the Hive metastore for an Amazon EMR 5.29 or 6.1 cluster.
Consider the following rules and guidelines:
Glue does not support Hive transactions.
Amazon supports Glue only when the EMR cluster is not Kerberos-enabled.
To integrate with a Glue-enabled EMR cluster, copy the required .jar file from the cluster to the domain, and then enable the Hive metastore setting in the hive-site.xml file before you create the cluster configuration.
Copy .jar files from the cluster to the domain.
Depending on the cluster version, copy the Hive .jar file from the cluster to the domain.
Copy the file from the following directory of the Glue-enabled EMR cluster:
/usr/lib/spark/jars/
Paste the file in the following domain directory on the domain machine:
For EMR 5.29, copy the following file: aws-glue-datacatalog-spark-client-1.11.0.jar
For EMR 6.1, copy the following file: aws-glue-datacatalog-spark-client-3.0.0.jar
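The version-to-file mapping above can be expressed as a small lookup, which may help when scripting the copy step. This is a minimal sketch: the function name glue_client_jar is hypothetical, and only the directory and file names come from this section. It resolves the source path on the cluster and does not perform the copy itself.

```python
from pathlib import PurePosixPath

# Directory on the Glue-enabled EMR cluster that holds the client .jar file.
SPARK_JARS_DIR = PurePosixPath("/usr/lib/spark/jars")

# File names listed in this section, keyed by EMR major.minor version.
GLUE_CLIENT_JARS = {
    "5.29": "aws-glue-datacatalog-spark-client-1.11.0.jar",
    "6.1": "aws-glue-datacatalog-spark-client-3.0.0.jar",
}


def glue_client_jar(emr_version: str) -> str:
    """Return the cluster path of the Glue client .jar for an EMR version.

    Hypothetical helper: accepts "5.29", "5.29.0", "6.1", "6.1.0", and so on.
    """
    major_minor = ".".join(emr_version.split(".")[:2])
    if major_minor not in GLUE_CLIENT_JARS:
        raise ValueError(f"No Glue client .jar listed for EMR {emr_version}")
    return str(SPARK_JARS_DIR / GLUE_CLIENT_JARS[major_minor])
```

For example, glue_client_jar("6.1.0") resolves to the 3.0.0 client file under /usr/lib/spark/jars/. Versions that this section does not list raise an error instead of guessing a file name.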
If the hive.metastore.uris property is not present in hive-site.xml, add it with the following value:
thrift://<Hive host name>:<port>
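To illustrate how this value sits inside hive-site.xml, the following sketch builds the property element in the standard Hadoop name/value layout. The function name, the host name, and the port are placeholders and not values from this guide. Substitute the Hive host name and port of your cluster.

```python
import xml.etree.ElementTree as ET


def metastore_property(host: str, port: int) -> ET.Element:
    """Build a <property> element for hive.metastore.uris.

    Hypothetical helper: host and port are supplied by the caller.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    ET.SubElement(prop, "value").text = f"thrift://{host}:{port}"
    return prop


# Placeholder host and port, for illustration only.
example = metastore_property("hive-host.example.com", 9083)
print(ET.tostring(example, encoding="unicode"))
```

The printed element is what you would place inside the configuration element of hive-site.xml.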
Edit the hive-site.xml file in the cluster configuration .zip archive:
Locate the cluster configuration .zip archive file. For more information about preparing for cluster configuration import, see the Amazon EMR chapter in the Data Engineering Integration Guide.
Edit the hive-site.xml file in the archive to add the hive.metastore.uris property-value pair.
After you save the changes to the hive-site.xml file, use the cluster configuration .zip archive to create the cluster configuration and Hadoop connection.
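If you prefer to script the archive edit instead of doing it by hand, the steps above could be sketched as follows. This is an illustration under stated assumptions, not a supported tool: the function names are hypothetical, the archive is assumed to contain an entry named hive-site.xml, and the standard-library XML parser drops comments and processing instructions when it rewrites that file. The sketch writes a new archive rather than modifying the original, so you can compare the two before you import.

```python
import zipfile
import xml.etree.ElementTree as ET

PROPERTY_NAME = "hive.metastore.uris"


def add_metastore_uri(xml_bytes: bytes, uri: str) -> bytes:
    """Return hive-site.xml content with hive.metastore.uris added if absent."""
    root = ET.fromstring(xml_bytes)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == PROPERTY_NAME:
            return xml_bytes  # property already present: leave content unchanged
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = PROPERTY_NAME
    ET.SubElement(prop, "value").text = uri
    # Note: comments and processing instructions from the input are not kept.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def patch_archive(src_zip: str, dst_zip: str, uri: str) -> None:
    """Copy src_zip to dst_zip, updating any hive-site.xml entry on the way."""
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.rsplit("/", 1)[-1] == "hive-site.xml":
                data = add_metastore_uri(data, uri)
            dst.writestr(info, data)
```

A call such as patch_archive("cluster_conf.zip", "cluster_conf_glue.zip", "thrift://hive-host.example.com:9083") uses placeholder file names and a placeholder URI. All other entries in the archive are copied unchanged.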
For more information about Amazon Glue, see the following Amazon documentation: