You are a data scientist and you want to explore the JupyterLab extension for INFACore to manage data from your JupyterLab environment.
To get started, perform the following tasks after you log in to INFACore:
Step 1. Set up the runtime environment
The runtime environment is the execution platform that runs the INFACore jobs.
First, let's install the agent on the machine that hosts your development environment.
In the Runtime Environment section, click to install the agent on your machine.
The runtime environment downloads an agent to your machine, and the status displays as up and running, as shown in the following image:
Step 2. Connect to the data source
First, select and configure the data source to which you want to connect. You can select an existing connection or create a new connection to connect to your data source.
In the Connect to Data Sources section, click the Data Source Type tab, and then select the data source from the list.
You can also search for the data source in the list.
On the Connections tab, select an existing connection for the data source from the list, or create a new connection to the data source.
To create a new connection, click the + icon, and then specify the details for the data source that you want to connect to.
For example, if you configure a Snowflake connection, enter a name for the connection, select the authentication method, and enter the Snowflake account details.
The following image shows the properties for a Snowflake connection:
When you create and save a new connection, that connection displays in the connection list.
Select the required connection and perform one of the following actions:
To add the connection to the code cell, select the corresponding icon, and provide a variable name for the connection to display in the Python code.
The selected connection is added to the Python code.
To edit the connection, select the corresponding icon, edit the connection details, and save the connection.
To test the connection, select the corresponding icon.
The Python code for testing the selected connection displays.
Run the code to test if you can connect to Snowflake.
Step 3. Explore the data
After you configure the data source, you can configure functions on your data to perform the following operations:
Read data, write data, and convert data to and from a pandas DataFrame.
Parse unstructured or semi-structured data.
Apply prebuilt rules to analyze the data.
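As a rough illustration of what converting to and from a pandas DataFrame means, the following sketch uses only pandas. It does not call INFACore. The rows and column names are hypothetical stand-ins for data that you read from a source.

```python
import pandas as pd

# Hypothetical rows standing in for data read from a source such as Snowflake.
rows = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "li@example.com"},
]

# Convert the rows to a pandas DataFrame for exploration.
df = pd.DataFrame(rows)

# Convert the DataFrame back to plain records, for example before a write.
records = df.to_dict(orient="records")
```

In the extension, you only provide a variable name for the DataFrame. The sketch shows the kind of object that the variable is expected to hold.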
On the General tab, configure one of the following operations, and then click Submit:
To read or write data, specify the data source, connection, and data object name.
To convert to or from a pandas DataFrame, specify a variable name for the pandas DataFrame.
To apply the parser function on unstructured or semi-structured data, provide a name for the data source, and specify the paths to the sample schema and the input file for the data to which you want to apply the parser function.
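To make the idea of parsing semi-structured data more concrete, here is a minimal plain-Python sketch that flattens nested JSON lines into flat rows. It is not the INFACore parser function, and it does not use a sample schema. The flatten helper, the dotted key names, and the input data are all assumptions made for illustration.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects and carry the parent key as a prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Hypothetical semi-structured input: one JSON document per line.
raw = """{"id": 1, "user": {"name": "Ana", "address": {"city": "Lisbon"}}}
{"id": 2, "user": {"name": "Li"}}"""

parsed_rows = [flatten(json.loads(line)) for line in raw.splitlines()]
```

Rows that lack a nested field, such as the second record here, simply omit that key. A schema-driven parser would typically fill such gaps according to the sample schema.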
On the Prebuilt Rules tab, select the required prebuilt rule to apply to your data, perform the following tasks, and then click Submit:
Enter the variable name for the source.
Specify the applicable column name for the field based on the rule you select.
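Conceptually, applying a rule takes a source variable and a column name, and evaluates each value in that column. The following plain-Python sketch shows that shape. The apply_rule and is_valid_email functions are hypothetical and are not INFACore prebuilt rules, and the email pattern is deliberately simplistic.

```python
import re

# Deliberately simple pattern for illustration; not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """Hypothetical rule: True when the value looks like an email address."""
    return isinstance(value, str) and EMAIL_PATTERN.match(value) is not None


def apply_rule(rows, column, rule):
    """Return a copy of each row with a '<column>_valid' flag added."""
    return [{**row, f"{column}_valid": bool(rule(row.get(column)))} for row in rows]


# Hypothetical source variable holding the data to check.
source = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "not-an-email"},
    {"customer_id": 3, "email": None},
]

checked = apply_rule(source, "email", is_valid_email)
```

The source variable and the column name in the sketch correspond to the two values that you enter on the tab.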
That's it! When you run the code, INFACore performs the configured operations on the data. If you want to check your activity, you can see it on the Activity Log page.
To configure any of these operations, you can also invoke the INFACore Python SDK directly. For more information about configuring these operations using the INFACore SDK for Python, see the "Read and write end-to-end example" in the "Quickstart" section in the