Testing the parsing configuration with pattern and name data
The following steps describe the process to test the parse asset with pattern data and associated name data. For example, you might discover names that the Parse transformation did not parse successfully at run time. Import the names and the associated patterns from the transformation output to the asset in Data Quality.
To import the names and associated patterns, copy the data to a CSV or Microsoft Excel file. The pattern values must populate the first column in the file, and the name values must populate the second column.
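The file layout described above can be prepared by hand or with a short script. The following Python sketch is one possible way to write a CSV file with the pattern in the first column and the name in the second. The function name `write_pattern_file` is hypothetical, and the sketch assumes that the file needs no header row, which this topic does not state. The 5,000-row check reflects the import limit for user-defined patterns.

```python
import csv

# Import limit for user-defined patterns, as stated in this topic.
MAX_PATTERNS = 5000


def write_pattern_file(rows, path):
    """Write (pattern, name) pairs to a CSV file.

    Column 1 holds the pattern and column 2 holds the name.
    Assumption: no header row is written.
    Returns the number of rows written.
    """
    rows = list(rows)
    if len(rows) > MAX_PATTERNS:
        raise ValueError(
            f"{len(rows)} rows exceeds the limit of {MAX_PATTERNS} patterns"
        )
    # newline="" lets the csv module control line endings.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Pass the function an iterable of two-item tuples copied from the transformation output, pattern first and name second. The csv module quotes any value that contains a comma, so names such as "Smith, John" stay in a single column.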
Open the parse asset that you created for pattern-based parsing.
Select the Configuration tab.
Select a runtime environment in which to test the configuration.
Use the Add User-Defined Patterns option to import the unparsed name and pattern data that the Parse transformation wrote as output.
You can import up to 5,000 user-defined patterns to a parse asset from a file that you specify.
When you select the file that contains the data, the Import Patterns dialog box opens. Select the Import Input Data option in the dialog box.
The pattern data appears in the User-Defined Patterns pane and the name data appears in the Inputs column of the Test Results pane.
Map the values in each pattern to an appropriate field in the pattern grid. For each value, select the field that best matches the type of information that the value represents.
Use the number in each pattern value as a guide when you map the values. The numbers match the order in which the data values appear in the input row. The numbered values in each pattern begin at (0).
Click Test.
The output columns display the test results.
Review the results of the test:
Find any input name that the test did not parse. Look for input names in the Unparsed Data field.
If you find a name that did not parse, make a copy of the pattern that the test assigned to the name. The test writes the pattern for each name to the Labeled Name field.
You can copy the patterns from the Labeled Name field to the file that you previously imported, or you can copy the patterns to a new file. Bear in mind that when you import data to the asset, you erase the prior test and pattern data on the Configuration tab.
Import the file that contains the latest pattern and name data.
Map the pattern values to the appropriate fields in the pattern grid.
Run the test again and review the results.
You may decide to import additional patterns to further update the pattern parsing logic.
When you are satisfied with the performance of the parsing operation on your sample data, save the asset.
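The zero-based numbering used in the mapping step can be illustrated with a short Python sketch. The helper `number_values` is hypothetical, and the sketch assumes that the values in an input row are separated by whitespace. The way the product splits an input row into values might differ.

```python
def number_values(input_row):
    """Pair each value in an input row with its zero-based position.

    Assumption: values are separated by whitespace.
    """
    return list(enumerate(input_row.split()))


# The first value is numbered 0, the second 1, and so on.
print(number_values("Maria Elena Garcia"))
# [(0, 'Maria'), (1, 'Elena'), (2, 'Garcia')]
```

In this illustration, a pattern value numbered 0 corresponds to the first value in the input row, so you would map it to the field that describes that first value.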