The Classifier transformation is a passive transformation that analyzes input fields and identifies the type of information in each field. Use a Classifier transformation when input fields contain multiple text values.
When you configure the Classifier transformation, you select a classifier model and a classifier algorithm. A classifier model is a type of reference data object. A classifier algorithm is a set of rules that calculates the number of similar words in a string and the relative positions of the words. The transformation compares the results of the algorithm analysis with the content of the classifier model and returns the model classification that identifies the dominant type of information in the string.
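The following Python sketch illustrates the general idea of comparing word counts and word positions against a model. It is a toy illustration only and does not reproduce the algorithms that the transformation uses. The `MODEL` dictionary, its labels, and the `features` and `classify` functions are hypothetical names introduced for this example. Adjacent word pairs stand in for the relative positions of the words.

```python
from collections import Counter

# Hypothetical stand-in for a classifier model: label -> example strings.
MODEL = {
    "billing": ["invoice payment overdue", "refund the payment to my card"],
    "support": ["the app crashes on login", "cannot reset my password"],
}

def features(text):
    """Return counts of single words and of adjacent word pairs."""
    words = text.lower().split()
    pairs = [" ".join(pair) for pair in zip(words, words[1:])]
    return Counter(words + pairs)

def classify(text, model=MODEL):
    """Return the label whose examples share the most features with text."""
    target = features(text)
    scores = {}
    for label, examples in model.items():
        reference = Counter()
        for example in examples:
            reference.update(features(example))
        # Counter & Counter keeps the minimum count of each shared feature.
        scores[label] = sum((target & reference).values())
    return max(scores, key=scores.get)

print(classify("my payment is overdue"))
print(classify("password reset crashes the app"))
```

In this sketch the label with the highest overlap score plays the role of the dominant type of information in the string. A real classifier model is built from much larger reference data.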
The Classifier transformation can analyze strings of significant length. For example, you can use the transformation to classify the contents of email messages, social media messages, and document text. You pass the contents of each document or message to a field in a data source column, and you connect the column to a Classifier transformation. In each case, you prepare the data source so that each field contains the complete contents of the document or message that you want to analyze.
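One way to prepare such a data source is to write each document or message to a single field in a delimited file. The Python sketch below assumes that the messages are already available as strings. The column names `doc_id` and `content`, the sample messages, and the `write_source` function are assumptions made for illustration, not names that the product requires.

```python
import csv
import io

def write_source(documents, out):
    """Write one row per document: an ID and the full text in a single field."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    writer.writerow(["doc_id", "content"])
    for doc_id, text in documents.items():
        # Collapse line breaks and repeated whitespace so that the complete
        # message occupies one field on one line.
        writer.writerow([doc_id, " ".join(text.split())])

# Hypothetical sample messages.
documents = {
    "msg-001": "Hello,\nmy invoice is overdue.\nPlease advise.",
    "msg-002": "The app crashes\non login.",
}

buffer = io.StringIO()
write_source(documents, buffer)
print(buffer.getvalue())
```

Collapsing line breaks is a design choice in this sketch. It keeps each message on one row so that a reader of the file does not split a message across several fields.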