Use the Python transformation to execute Python code in a mapping that runs on the Spark engine.
The Python transformation is a passive transformation that provides an interface to define transformation functionality using Python code. Within the transformation, you reference the Python code and any resource files that the code uses.
You can use a Python transformation to apply a machine learning model to the data that you pass to the transformation. For example, you can write Python code in the transformation that loads a pre-trained model, and then use that model to classify input data or generate predictions.
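The following is a minimal sketch of the load-then-classify pattern in plain Python. It does not show the transformation's own interface. The names MODEL_JSON, load_model, and classify are hypothetical, and so are the model parameters and the input column names. In a real mapping, the serialized model would more likely come from a resource file, and the rows would come from the transformation's input.

```python
import json

# Hypothetical pre-trained model parameters. In practice these would
# typically be read from a resource file rather than defined inline.
MODEL_JSON = '{"weights": {"amount": 0.02, "num_items": 0.5}, "bias": -3.0}'


def load_model(serialized):
    """Deserialize the model parameters once, before any rows are processed."""
    params = json.loads(serialized)
    return params["weights"], params["bias"]


def classify(row, weights, bias):
    """Score one input row with a linear model and return a class label."""
    score = bias + sum(weights[name] * row[name] for name in weights)
    return "high" if score >= 0 else "low"


weights, bias = load_model(MODEL_JSON)

# Hypothetical input rows standing in for the data passed to the transformation.
rows = [
    {"amount": 20.0, "num_items": 1},
    {"amount": 200.0, "num_items": 4},
]
predictions = [classify(row, weights, bias) for row in rows]
print(predictions)
```

Loading the model once and reusing it for every row keeps the per-row work limited to scoring, which matters when the same code runs over large data sets.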
Before you can use the Python transformation, you must install Python on the Data Integration Service machine and configure the corresponding Spark advanced properties in the Hadoop connection.
For more information about installing Python, see the