JupyterLab Extension for INFACore

Configure the structure parser function

INFACore parses unstructured or semi-structured data using the Intelligent Structure Discovery (ISD) JAR files, which are bundled with the INFACore installation.
To parse data, select the Parse Unstructured Data function and specify the following fields:
  • New DataFrame Name: Specify a name for the new DataFrame. A DataFrame is a two-dimensional data structure in which data is arranged in tabular form, in rows and columns.
  • Schema file path: Specify the file path to the sample schema file.
  • Input file path: Specify the path to the source file that contains the unstructured data.
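Both path fields must point to files that exist on the machine where the function runs. As a minimal sketch, the following plain Python helper checks the two paths before you apply the parser function. The helper name check_parser_inputs is hypothetical and is not part of INFACore; it only uses the Python standard library.

```python
from pathlib import Path


def check_parser_inputs(input_path, schema_path):
    """Return a list of problems with the two paths (empty if both are files).

    Hypothetical helper for illustration; not part of the INFACore SDK.
    """
    problems = []
    for label, raw in (
        ("Input file path", input_path),
        ("Schema file path", schema_path),
    ):
        path = Path(raw)
        if not path.is_file():
            problems.append(f"{label} does not point to an existing file: {path}")
    return problems


# Example call with placeholder file names in the current directory.
for problem in check_parser_inputs("json_input.json", "sample_schema.txt"):
    print(problem)
```

An empty list means that both files were found. Anything else is worth fixing before you run the parser function.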
Example
The following image is a snapshot of the unstructured data in JSON format in the json_input.json file that you want to parse:
[Image: The input file contains data in unstructured format.]
Provide the path to the sample schema file, sample_schema.txt, that you want INFACore to refer to when it parses the unstructured data:
[Image: You can view the sample schema file.]
The following sample Python code appears when you apply the parser function with the input file and the sample schema file:
import informatica.infacore as ic

pf = ic.ParserFunctions()
parser_data = pf.parse_unstructured_data(
    "C:\\Users\\John\\Documents\\FF_SOURCES\\json_input.json",
    "C:\\Users\\John\\Documents\\FF_SOURCES\\sample_schema.txt",
)
To apply pandas functions, use the Python SDK to convert the INFACore DataFrame to a pandas DataFrame, and then return the first rows:
df_reader = ic.DataFrameReader(parser_data)
p_df = df_reader.to_pandas()
p_df.head()
For more information, see the INFACore SDK Reference for Python.
When you run the code, the structure parser function returns data in a structured format:
State  Account Length  Area Code  Phone     Int'l Plan  VMail Plan  VMail Message  token  Mins       Calls  Charge    CustServ Calls  Churn
PA     163             806        403-2562  no          yes         300            Day    8.162204   3      7.579174  3               True.
PA     163             806        403-2562  no          yes         300            Eve    3.933035   4      6.508639  3               True.
PA     163             806        403-2562  no          yes         300            Night  4.065759   100    5.111624  3               True.
PA     163             806        403-2562  no          yes         300            Intl   4.92816    6      5.673203  3               True.
SC     15              836        158-8416  yes         no          0              Day    10.018993  4      4.226289  8               False.
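
In this output, each customer appears once per token value (Day, Eve, Night, Intl), so the result is in a long format. As a sketch of what you might do after the conversion to pandas, the following example totals the charges per customer and pivots the minutes into one row per customer. It assumes that the pandas DataFrame has the column names shown above. Because INFACore is not imported here, the DataFrame is rebuilt by hand from a subset of the sample columns; in practice, p_df would come from df_reader.to_pandas().

```python
import pandas as pd

# Hand-copied subset of the sample output above (assumption: same column names).
p_df = pd.DataFrame(
    {
        "State": ["PA", "PA", "PA", "PA", "SC"],
        "Phone": ["403-2562", "403-2562", "403-2562", "403-2562", "158-8416"],
        "token": ["Day", "Eve", "Night", "Intl", "Day"],
        "Mins": [8.162204, 3.933035, 4.065759, 4.92816, 10.018993],
        "Calls": [3, 4, 100, 6, 4],
        "Charge": [7.579174, 6.508639, 5.111624, 5.673203, 4.226289],
    }
)

# Sum the charges across all tokens for each customer.
total_charge = p_df.groupby(["State", "Phone"])["Charge"].sum()

# Reshape to one row per customer, with one Mins column per token.
# Tokens missing for a customer appear as NaN.
mins_wide = p_df.pivot_table(index=["State", "Phone"], columns="token", values="Mins")

print(total_charge)
print(mins_wide)
```

The pivot leaves NaN for tokens that a customer has no row for, which is the case here for the SC customer because only its Day row is included in the sample.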
