Developer Transformation Guide

Python Transformation

The Python transformation provides an interface to define transformation functionality using Python code.

Python uses simple syntax, dynamic typing, and dynamic binding, which makes it a good choice for increasing productivity and for rapid application development. When you use Python code in a data engineering mapping, the code is embedded into the generated Scala code that the Spark or Databricks Spark engine runs to process large, diverse, and fast-changing data sets.

You can also use the Python transformation for machine learning. In the transformation, you can specify a resource file that contains a pre-trained model and load the pre-trained model in the Python code. For example, you can load a pre-trained model to classify input data or to create predictions.
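As a rough illustration of the load-then-predict pattern described above, the following minimal sketch loads model parameters from a file and classifies input rows. It is not the transformation's own API: the file name model.pkl, the PRETRAINED dictionary, and the functions write_resource_file, load_model, and predict are all hypothetical names chosen for this example, and a temporary file stands in for the resource file that you would specify in the transformation.

```python
import math
import os
import pickle
import tempfile

# Hypothetical stand-in for a pre-trained model: logistic-regression
# parameters stored as a plain dict. A real resource file would be
# produced by a separate training job.
PRETRAINED = {"weights": [0.8, -0.5], "bias": 0.1}


def write_resource_file(path):
    """Serialize the stand-in model so the sketch has a file to load."""
    with open(path, "wb") as f:
        pickle.dump(PRETRAINED, f)


def load_model(path):
    """Load the model parameters from the resource file once, up front."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, features):
    """Return (probability, label) for one row of numeric input features."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, int(probability >= 0.5)


# Demonstration: a temporary file stands in for the resource file.
with tempfile.TemporaryDirectory() as tmp:
    resource_path = os.path.join(tmp, "model.pkl")  # hypothetical file name
    write_resource_file(resource_path)
    model = load_model(resource_path)

# Classify two example input rows with the loaded model.
rows = [[2.0, 1.0], [0.0, 3.0]]
predictions = [predict(model, row) for row in rows]
```

Loading the model once and reusing it for every row avoids reading the file for each record, which matters when the engine processes large data sets. Only unpickle files from a source you trust, because pickle can execute arbitrary code during loading.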

Before you can use the Python transformation, configure the corresponding Spark advanced properties in the Hadoop connection or Databricks connection properties. Then, ensure that Python is installed on the worker nodes of the cluster.

For more information about installing Python, see the Data Engineering Integration Guide.

You can run the Python transformation only on the Spark or Databricks Spark engine. You cannot run it in the native environment.
