Use the Jaro Distance algorithm to compare two strings when the similarity of the initial characters in the strings is a priority.
The Jaro Distance match score reflects the degree of similarity between the first four characters of both strings and the number of identified character transpositions. The transformation weights the importance of the match between the first four characters by using the value that you enter in the Penalty property.
Jaro Distance Properties
You can configure the following properties for the Jaro Distance algorithm:
Penalty
Determines the match score penalty if the first four characters in two compared strings are not identical. The transformation subtracts the full penalty value for a first-character mismatch. The transformation subtracts fractions of the penalty based on the position of the other mismatched characters. The default penalty value is 0.20.
Case Sensitive
Determines whether the Jaro Distance algorithm considers character case when it compares characters.
Jaro Distance Example
Consider the following strings:
391859
813995
If you use the default Penalty value of 0.20 to analyze these strings, the Jaro Distance algorithm returns a match score of 0.513. This match score indicates that the strings are 51.3% similar.
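To make the ideas of matched characters and transpositions concrete, the following Python sketch computes the textbook Jaro similarity. It is an illustration only, not the transformation's implementation: the function name jaro_similarity and the case_sensitive parameter are hypothetical, the half-transposition count is rounded down (an assumption; some implementations do not round), and the sketch does not apply the Penalty rule because the exact fractions that the transformation subtracts are not specified here.

```python
def jaro_similarity(s1: str, s2: str, case_sensitive: bool = True) -> float:
    """Textbook Jaro similarity in [0, 1]. Illustrative sketch only;
    it does not apply the Penalty property described above."""
    if not case_sensitive:
        # Mirrors the idea of the Case Sensitive property (assumed behavior).
        s1, s2 = s1.lower(), s2.lower()
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two equal characters count as a match only if their positions
    # differ by no more than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order are
    # half-transpositions; two of them make one transposition.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


if __name__ == "__main__":
    print(round(jaro_similarity("391859", "813995"), 3))
```

For the example strings, this unpenalized computation finds five matched characters and two transpositions, which gives a score of about 0.756. The difference from the 0.513 score reported above is consistent with the transformation subtracting a penalty for mismatches in the first four characters, which this sketch deliberately omits.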