Creating a Column Profile on a Semi-structured Data Source
After you create a flat file data object or complex file data object from Avro, JSON, Parquet, or XML data sources, you can create and run a column profile on the data object.
In the Object Explorer view, select the data object for the Avro, JSON, Parquet, or XML file.
Click File > New > Profile.
The New dialog box appears.
Select Profile. Click Next.
The New Profile dialog box appears.
In the New Profile dialog box, add a name for the profile and an optional description.
Select the Process Extended File Formats option. Click Next.
The following image shows the New Profile wizard with the Process Extended File Formats option:
Process Extended File Formats. Select this option to process semi-structured data sources.
The Process Extended File Formats option does not appear for Avro and Parquet data sources when you choose the Resource Format as Avro or Parquet.
In the Single Data Object Profile page, select the columns and options under Column Selection and Data Domain Discovery as required. Click Finish.
If the Developer tool is installed on a Linux machine and the JSON or XML physical data object is a flat file data object with a text file, then perform the following tasks:
On the Overview tab, update the Precision value to include the number of characters in the file path of the data source on the server.
Update the file path of the data source to the location on the server after you create a profile on the flat file data object. To update the file path, click