Developer Transformation Guide

Jaro Distance

Use the Jaro Distance algorithm to compare two strings when the similarity of the initial characters in the strings is a priority.
The Jaro Distance match score reflects the degree of similarity between the first four characters of both strings and the number of identified character transpositions. The transformation weights the importance of the match between the first four characters by using the value that you enter in the Penalty property.

Jaro Distance Properties

When you configure a Jaro Distance algorithm, you can configure the following properties:
Penalty
Determines the match score penalty if the first four characters in two compared strings are not identical. The transformation subtracts the full penalty value for a first-character mismatch. The transformation subtracts fractions of the penalty based on the position of the other mismatched characters. The default penalty value is 0.20.
Case Sensitive
Determines whether the Jaro Distance algorithm considers character case when it compares characters.

Jaro Distance Example

Consider the following strings:
  • 391859
  • 813995
If you use the default Penalty value of 0.20 to analyze these strings, the Jaro Distance algorithm returns a match score of 0.513. This match score indicates that the strings are 51.3% similar.
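
For orientation, the following is a minimal sketch of the textbook Jaro similarity calculation, which counts matching characters within a sliding window and then counts transpositions among them. It is not the transformation's implementation: the function name `jaro_similarity` and the `case_sensitive` parameter are hypothetical, and the sketch does not apply the Penalty property, because the exact fractions that the transformation subtracts are not specified here.

```python
def jaro_similarity(s1: str, s2: str, case_sensitive: bool = True) -> float:
    """Textbook Jaro similarity in [0.0, 1.0]; 1.0 means identical strings.

    Illustrative only: no Penalty adjustment is applied (assumption).
    """
    if not case_sensitive:
        s1, s2 = s1.lower(), s2.lower()
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0

    # Two characters match only if they are equal and at most
    # `window` positions apart.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len(s1)
        + matches / len(s2)
        + (matches - transpositions) / matches
    ) / 3


print(round(jaro_similarity("391859", "813995"), 3))
print(jaro_similarity("Smith", "SMITH", case_sensitive=False))
```

For the strings 391859 and 813995, this unpenalized textbook formula yields approximately 0.756 rather than 0.513, so the transformation's score evidently reflects the Penalty property and possibly other scoring details that this sketch does not model.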
