If you want to process data that contains non-ASCII characters, you must integrate the locale setting on Data Engineering Integration with the locale setting on the cluster.
Perform this task in the following situations:
You are integrating for the first time.
To integrate the locale setting, complete the following tasks:
In the Hadoop connection, navigate to Hadoop Cluster Properties. As the value for the property Cluster Environment Variables, configure the locale environment variables, such as the LANG or LC_ALL environment variable.
The locale setting in the Hadoop connection must match the locale setting that is configured in the domain. To view the locale environment variable values set in the domain, run the following command on any node in the cluster:
locale
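As a rough sketch of how the comparison in this step could be scripted, the following shell snippet reads one variable from the output of `locale` and checks it against the value you configured in the Hadoop connection. The helper names `locale_value` and `locale_matches`, and the value `en_US.UTF-8` in `EXPECTED_LANG`, are hypothetical placeholders for illustration and are not part of the product.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the product.

# Read `locale`-style output on stdin and print the value of one variable.
# Lines look like LANG=en_US.UTF-8 or LC_CTYPE="en_US.UTF-8", so quotes are stripped.
locale_value() {
    sed -n "s/^$1=//p" | tr -d '"' | head -n 1
}

# Succeed (exit status 0) only when the actual value equals the expected value.
locale_matches() {
    [ "$1" = "$2" ]
}

# Assumed example value: replace with the value set in the Hadoop connection.
EXPECTED_LANG="en_US.UTF-8"

actual_lang=$(locale | locale_value LANG)
if locale_matches "$actual_lang" "$EXPECTED_LANG"; then
    echo "LANG matches: $actual_lang"
else
    echo "LANG mismatch: node reports '$actual_lang', expected '$EXPECTED_LANG'"
fi
```

The same check can be repeated for LC_ALL or any other locale variable by passing a different name to `locale_value`.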
In Cloudera Manager, add the environment variables to the following YARN property: