Install Python for the Python transformation on Hadoop

Steps

This article uses the following method to install Python on each Data Integration Service machine:
  1. Install Python in a directory, and set the $PYTHONHOME environment variable to that directory.
  2. Install the Python third-party libraries in the same location.
  3. Copy the contents under $PYTHONHOME to $INFA_HOME/services/shared/spark/python/, where $INFA_HOME is the location of the Informatica domain. (A copy command for this step is sketched after this list.)
  4. Repeat steps 1 to 3 for each Data Integration Service machine.
The final location, $INFA_HOME/services/shared/spark/python/, is the location that the Data Integration Service uses to push the Python installation to the Hadoop cluster nodes where the Spark engine runs the Python transformation.
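
The following is a minimal Python sketch of step 3. It assumes $PYTHONHOME and $INFA_HOME are exported as environment variables on the Data Integration Service machine and that Python 3.8 or later is available (for the dirs_exist_ok option); adapt the paths to your installation.

  # Sketch: copy the $PYTHONHOME contents into the Data Integration Service
  # staging directory. Assumes $PYTHONHOME and $INFA_HOME are set in the
  # environment; requires Python 3.8+ for dirs_exist_ok.
  import os
  import shutil

  python_home = os.environ["PYTHONHOME"]   # Python installation directory from step 1
  infa_home = os.environ["INFA_HOME"]      # location of the Informatica domain
  target = os.path.join(infa_home, "services", "shared", "spark", "python")

  # Copy everything under $PYTHONHOME (interpreter, standard libraries, and
  # the third-party packages installed in step 2) into the Spark directory.
  shutil.copytree(python_home, target, dirs_exist_ok=True)
  print(f"Copied Python installation from {python_home} to {target}")

Running the script once per Data Integration Service machine (step 4) keeps the staged Python installation consistent across the domain.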
