The Python code defines how the Python transformation processes data. When you write Python code, you might reconstruct input variables, load a pre-trained model, define output variables, and define additional transformation functionality.
To write the Python code, use the following tabs:
Pre-Input
On the Pre-Input tab, define code that can be interpreted once and shared between all rows of data.
Use the Pre-Input tab to perform the following tasks:
Declare import statements.
Declare variables.
Initialize variables.
Define helper methods.
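The tasks above can be sketched as a short Pre-Input block. The names row_count, WHITESPACE, and normalize are hypothetical and chosen only for illustration.

```python
# Pre-Input sketch: code that is interpreted once and then shared by all rows.

import re                                # declare import statements

row_count = 0                            # declare and initialize a variable
WHITESPACE = re.compile(r"\s+")          # compile once instead of once per row


def normalize(text):
    """Helper method: collapse runs of whitespace and trim both ends."""
    return WHITESPACE.sub(" ", text).strip()
```

Keeping imports, compiled patterns, and helpers here avoids repeating that setup in the per-row code.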
On Input
On the On Input tab, define how the Python transformation behaves when it receives an input row while processing a partition. The Python transformation runs the code on the On Input tab once for each row in each partition.
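The per-row behavior can be simulated outside the transformation. In the sketch below, the two loops and the results list are hypothetical stand-ins for the runtime, and in_name and out_name are assumed port names. Only the marked line corresponds to code you would place on the On Input tab.

```python
# Hypothetical stand-in for the runtime: two partitions, each a list of rows
# carrying a single input port named in_name (an assumed port name).
partitions = [["ada", "grace"], ["linus"]]
results = []

for partition in partitions:
    for in_name in partition:        # stand-in: the runtime supplies the input port value
        # --- On Input code starts here ---
        out_name = in_name.upper()   # set the output port for this row
        # --- On Input code ends here ---
        results.append(out_name)     # stand-in: record what the output port held
```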
At End
On the At End tab, define how the Python transformation behaves after it processes all input data in a partition. You can also call the generateRow() method to generate output rows for the partition.
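As a rough illustration, the sketch below accumulates a total on each row and writes one summary row per partition at the end. The generateRow() function defined here is only a stub so that the example runs on its own; in the transformation, the method is supplied for you. The port names in_amount and out_total are assumptions, as is the way the loops simulate partitions.

```python
emitted = []


def generateRow():
    # Hypothetical stub: records the current value of the output port.
    # The real generateRow() is provided by the Python transformation.
    emitted.append(out_total)


for partition in [[1, 2, 3], [10, 20]]:   # stand-in for two partitions
    running_total = 0                     # reset per partition (an assumption of this sketch)
    for in_amount in partition:
        running_total += in_amount        # On Input: accumulate per row
    # --- At End code starts here ---
    out_total = running_total             # set the output port
    generateRow()                         # write one summary row for the partition
    # --- At End code ends here ---
```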
Guidelines for Writing Python Code
Use the following rules to write Python code:
Define variables before you use them. For example, you cannot reference a variable on the Pre-Input tab if the variable is defined on the On Input tab.
To access an input port, reference the port by its name.
To set an output port, assign a value to it. You must assign a value to every output port that you define in the Python transformation.
To define how the transformation writes data from the input ports to output ports, set the output port to the value of the input port.
For example, write output_port = input_port to write the data from the input port input_port to the output port output_port.
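A minimal sketch of these port rules follows. The port names in_id, in_email, out_id, out_email, and out_domain are hypothetical, and the first line stands in for values that the transformation would supply.

```python
# Stand-in for input port values; in the transformation these already exist.
in_id, in_email = 7, "ada@example.com"

out_id = in_id          # pass the input port through unchanged
out_email = in_email    # pass the input port through unchanged

# Every output port gets a value, even when there is nothing useful to derive.
if "@" in in_email:
    out_domain = in_email.split("@")[-1]
else:
    out_domain = ""
```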
To access resource files, use the variable resourceFilesArray. Specify the resource file using an index such as resourceFilesArray[0].
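The sketch below shows one way the indexed access might look. It assumes that each entry of resourceFilesArray is a file path, which you should confirm in your environment. The temporary file and the assignment to resourceFilesArray are stand-ins so that the example runs on its own.

```python
import os
import tempfile

# Stand-in setup: create a small file and a list that plays the role of
# resourceFilesArray. In the transformation, the variable is provided for you.
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("alpha\nbeta\n")
tmp.close()
resourceFilesArray = [tmp.name]

# Read the first resource file by index (assumes the entry is a path).
with open(resourceFilesArray[0]) as handle:
    labels = [line.strip() for line in handle if line.strip()]

os.remove(tmp.name)   # clean up the stand-in file
```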
When you run the Python transformation, the Data Integration Service does not validate the Python code.
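Because the code is not validated for you, it can help to check it yourself before you run the transformation. The following is a sketch of one possible local check, not a feature of the product. It uses the standard ast module and catches syntax errors only. The function name syntax_error_in is hypothetical.

```python
import ast


def syntax_error_in(source):
    """Return a short message if source has a syntax error, otherwise None."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


ok = syntax_error_in("output_port = input_port")
bad = syntax_error_in("output_port = = input_port")
```

A check like this does not detect undefined variables or missing output port assignments, so testing with sample rows is still worthwhile.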