Service Group Application Reference

Parameters

The application program calls the NAMESET Service with the following parameters:

No.	Name	Size (bytes)	Filled in by
1	Service name	8/32	Application
2	Response code	20	NAMESET
3	Function	32, or 2 – 1024	Application
4	Name in	As defined in Algorithm	Application
5	Cleaned Name	Same as Name in	NAMESET
6	Word Stack	258 (by default)	NAMESET
7	Keys Stack	142 (by default)	NAMESET
8	Search Table	677 (by default)	NAMESET
9	Categories	20	NAMESET
10	Work-area	100,000 (minimum)	NAMESET

SERVICE-NAME (8 or 32 bytes): The name of the Service for the NAMESET Service type as it has been defined in the Algorithm Definition. The name is either 8 bytes if fixed in length, or up to 32 bytes if variable in length. Refer to the person responsible for defining and customizing the Algorithms.

RESPONSE-CODE (20 bytes): This parameter is filled in by the Service to indicate the success or otherwise of the Call. A value of 00 in the first two positions indicates that all was well; any other value flags a warning or an error. For a description of how to check Response Codes, turn to the How an Application Should Test the Response Code section.
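
The check described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the Response code buffer holds ASCII characters, which may not be true on every platform.

```python
RESPONSE_CODE_LENGTH = 20  # size of the Response code parameter in bytes


def call_succeeded(response_code: bytes) -> bool:
    """Return True when the first two positions of the Response code are '00'.

    Assumption: the buffer is ASCII encoded. Any other value in the first
    two positions is treated as a warning or an error.
    """
    if len(response_code) != RESPONSE_CODE_LENGTH:
        raise ValueError("Response code must be exactly 20 bytes")
    return response_code[:2] == b"00"
```

An application would typically run this test immediately after every Call, before reading any of the other returned parameters.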

FUNCTION (32, or 2-1024 bytes): The NAMESET Function is used to control the type of key building or search strategy returned by NAMESET to the calling program. The Keys are returned in the Keys-stack parameter. The search strategy is returned as an array of name key search ranges in the Search-table parameter.

Different applications within one organization will typically have different key building and search requirements and will therefore normally use different NAMESET Functions. The application designer needs to understand which Function to use, because different Functions have different effects on search performance and quality, and produce search tables which require different processing.

The different types of Keys available are Preferred, Positive or Negative.

The three types of Search Strategies supported are Positive, Negative & Custom. Each can be tailored to provide different emphasis on performance and quality. In all cases, special search ranges called Probes and Secondary Name lookup can also be requested. Following is a description of each of these types of search ranges:

Positive search ranges (otherwise referred to as cascade search ranges) - these search ranges use the search name in the preferred key order. They start with a narrow search and progressively widen. Applications normally process one search range at a time, returning to the user with the results, and then allow the user to choose whether or not to progress to the next, wider, search range.
Negative search ranges - these search ranges are all built for the same depth of search, but are built by permuting the words from the search name into different orders. It is normal to process all negative search ranges before returning to the user with the results or before any decision on the matches is taken.
Secondary name search ranges - are used to invoke secondary name lookup processing. They precede either a positive, negative or Customset Search-table.
Customset search ranges - allow user-defined search ranges and probes to be specified.
Probes - these search ranges are used in different situations to return small sets of records defining candidates which more closely match the search name. They generally precede either a positive or negative Search-table. Probes include Word probes, Word + Initial probes, Customset and Secondary Name probes.

A NAMESET Function is composed of one or more NAMESET Function Keywords, initiated and terminated by an asterisk (*). The maximum length of the total Function specification is 1024 bytes. Here is an example of a valid Function:

*NEG,START=WI*

Here WI means Word + Initial range.

A NAMESET Function which contains only the characters ** uses the following default values:

*FINE,CASCADE*

The NAMESET Function can also be defined in the Service Group definition as a NAMESET Function ’Definition’, and given a name. Refer to the Service Group Definition section of the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS for more details. The NAMESET Function Definition ’name’ is passed as a parameter instead of the explicit Function keywords. For example, if the Service Group definition contains:


FUNCTIONS-DEFINITION
NEGLGE:NEG,START=WI

Then, when calling the NAMESET Service, this predefined function name, NEGLGE, can be used instead of the explicit keywords. When a Function definition name is used, it must be left justified in a 32 byte field and padded with spaces. Note that no asterisks are used around the Function definition name. The Function parameter can contain a combination of both Function keywords and Function definitions, by use of the BASE= keyword. For example,

*BASE=NEGLGE,FULLSEARCH,NOKEYS*

would use both the Function definition keywords specified by NEGLGE as well as the keywords FULLSEARCH and NOKEYS.
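
The two ways of filling the Function parameter can be sketched as follows. The helper names are hypothetical and the buffers are assumed to be ASCII; the sketch does not address whether a keyword specification must be padded to a fixed buffer length in a particular environment.

```python
MAX_FUNCTION_LENGTH = 1024   # maximum length of a keyword specification
DEFINITION_FIELD_LENGTH = 32  # fixed field size for a Function definition name


def function_from_keywords(*keywords: str) -> bytes:
    """Build a keyword specification such as *NEG,START=WI*.

    With no keywords the result is **, which requests the defaults.
    """
    spec = "*" + ",".join(keywords) + "*"
    if len(spec) > MAX_FUNCTION_LENGTH:
        raise ValueError("Function specification exceeds 1024 bytes")
    return spec.encode("ascii")


def function_from_definition(name: str) -> bytes:
    """Left justify a Function definition name in a 32 byte, space padded field.

    No asterisks are placed around a definition name.
    """
    if not name or "*" in name or len(name) > DEFINITION_FIELD_LENGTH:
        raise ValueError("Invalid Function definition name")
    return name.ljust(DEFINITION_FIELD_LENGTH).encode("ascii")
```

For example, function_from_keywords("BASE=NEGLGE", "FULLSEARCH", "NOKEYS") yields the combined form shown above, while function_from_definition("NEGLGE") yields the name followed by 26 spaces.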

Function Keywords: NAMESET Function keywords are described in the Service Group Definition section of the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS.

NAME-IN (10-255 bytes): This is the name for which keys or search ranges are to be built. In most cases it should be the full name; however, refer to the Algorithm Definition/Tips on Customizing an Algorithm section in the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS for a better description of the considerations in determining what is a ’name’.

The length of this parameter must be the same as defined in the NAME-LENGTH parameter of this Service’s Algorithm Definition.

CLEANED-NAME (10-255 bytes): This is the name after physical cleaning by the Cleaning and Character Set rules defined for this Service’s Algorithm. Refer to the Nameset section for more details on the Cleaning process. CLEANED-NAME must have the same length as NAME-IN.

WORDS-STACK (258 bytes (default)): The Words-stack contains an array of words extracted from the Cleaned Name. It is preceded by a two digit count of the number of words in the stack. The default number of entries in the array is 8. Each entry in the array contains a word, maximum length 24 bytes, and 8 bytes of extra information about that word. The total length of each array entry is therefore 32 bytes. The structure of each stack entry is defined in the following table.

Name	Offset	Size	Description
Word	0	24	The cleaned word.
Word type	24	2	The first byte of this two byte field contains the final word type. The second byte contains the original type before processing. See the table below for a list of Word-types.
Category	26	2	The last Edit-list Category processed for this word.
Original initial	28	1	The initial of the word before any Edit-list processing. For example, if the Edit-list contains a nickname rule to change BILL to WILLIAM, this field will contain a B and the Word will contain WILLIAM.
Word not Stabilized	29	1	A Y in this position indicates that the word is defined as a type O in the Edit-list. This means that the word will not be Stabilized.
Filler	30	2	Not used

The following table contains a list of the valid Word-types for the NAMESET Service.

Word-type	Description
B	Suspect Code: a word with one code character; a word with one or more ambiguous characters; a one or two character word preceded or followed by a code word
C	Code: a single code character (initial); a word with 2 or more code characters
I	A single character (Initial)
M	A Major Word
N	A Major Code-word
S	A Skip Word
T	A Skip Code-word
Y	Any other word
blank	An unused entry in the Words-stack.

The number of entries to allow in the Words-stack is controlled by the Algorithm Definition parameter:
WORDS-STACK-SIZE=nn
The default value is 8 and thus the default size of this parameter is calculated as follows: (32 x 8) + 2 = 258

It is sometimes necessary to increase the number of entries in the Words-stack when dealing with long names and addresses, otherwise truncation will occur, affecting the search and matching quality. The maximum number of entries is 99 and therefore the maximum size is: (32 x 99) + 2 = 3170
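
The Words-stack layout and size calculation above can be sketched in Python as follows. The class and function names are hypothetical, and the sketch assumes the buffer is ASCII with the two digit count stored as characters.

```python
from typing import List, NamedTuple

ENTRY_SIZE = 32  # 24 byte word + 8 bytes of extra information


class WordEntry(NamedTuple):
    word: str
    final_type: str        # first byte of the Word type field
    original_type: str     # second byte of the Word type field
    category: str
    original_initial: str
    not_stabilized: bool


def words_stack_size(entries: int = 8) -> int:
    """Size in bytes of a Words-stack holding the given number of entries."""
    if not 1 <= entries <= 99:
        raise ValueError("WORDS-STACK-SIZE must be between 1 and 99")
    return ENTRY_SIZE * entries + 2


def parse_words_stack(stack: bytes) -> List[WordEntry]:
    """Decode the counted entries of a Words-stack (assumed ASCII)."""
    count = int(stack[:2].decode("ascii"))
    words = []
    for i in range(count):
        start = 2 + i * ENTRY_SIZE
        raw = stack[start:start + ENTRY_SIZE].decode("ascii")
        words.append(WordEntry(
            word=raw[0:24].rstrip(),
            final_type=raw[24],
            original_type=raw[25],
            category=raw[26:28],
            original_initial=raw[28],
            not_stabilized=(raw[29] == "Y"),
        ))
    return words
```

Only the first count entries are decoded; the remaining entries are unused and carry a blank Word-type.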

KEYS-STACK (142 bytes (default for 5-byte keys), 202 bytes (default for 8-byte keys)): The Keys-stack contains an array of Name-keys generated from the input name. It is preceded by a two digit count of the number of keys in the stack.

A key-building application will store the keys from the Keys-stack into a database index.

The default key is 5 bytes long and contains the full range of binary values. An alternate key of length 8 bytes and containing only printable characters can be returned in case the database or application language cannot handle the binary keys (the hex value ’00’ is usually the problem). Refer to the Algorithm Definition chapter of the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS for details on how to request 8 byte character keys.

The structure of a Keys-stack entry for 5-byte keys is as follows:

Name	Offset	Size	Description
Key	0	5	The binary Name-key.
Key type	5	2	The Key Type

The structure of a Keys-stack entry for 8-byte keys is as follows:

Name	Offset	Size	Description
Key	0	8	The character Name-key.
Key type	8	2	The Key Type

The Key-type defines the particular encoding mechanism used to build the key. This depends on the mix of common and uncommon words in the name. A common word is a word that exists in the Frequency Table. An uncommon word is one that does not exist in the Frequency Table. For a discussion on the Frequency Table, refer to the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS. The following table provides a general description of the Key-type groups.

Key-type	Description
An	Key built from all uncommon words.
Bn, Cn	Key built from a mixture of common and uncommon words where the major word is uncommon.
Dn, En, Fn, Gn	Key built from a mixture of common and uncommon words where the major word is common.
Hn	Key built from all common words.

The number of entries to allow in the Keys-stack is controlled by the Algorithm Definition parameter:
KEYS-STACK-SIZE=nn

The default value is 20 and thus the default size of this parameter for 5-byte keys is calculated as follows: (7 x 20) + 2 = 142

It is sometimes necessary to increase the number of entries in the Keys-stack when dealing with long names and addresses, otherwise truncation will occur, affecting the search quality. The maximum number of entries is 99 and therefore the maximum size for 5-byte keys is: (7 x 99) + 2 = 695
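
A key-building application might unpack the Keys-stack along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the two digit count and the Key type are assumed to be ASCII characters, and 5-byte keys are kept as raw bytes because they may contain any binary value.

```python
from typing import List, Tuple


def keys_stack_size(entries: int = 20, key_length: int = 5) -> int:
    """Size in bytes of a Keys-stack: (key + 2 byte Key type) per entry, plus the count."""
    if key_length not in (5, 8):
        raise ValueError("Key length must be 5 or 8")
    if not 1 <= entries <= 99:
        raise ValueError("KEYS-STACK-SIZE must be between 1 and 99")
    return (key_length + 2) * entries + 2


def parse_keys_stack(stack: bytes, key_length: int = 5) -> List[Tuple[bytes, str]]:
    """Return (key, key_type) pairs for the counted entries of a Keys-stack."""
    entry_size = key_length + 2
    count = int(stack[:2].decode("ascii"))
    keys = []
    for i in range(count):
        start = 2 + i * entry_size
        key = stack[start:start + key_length]  # raw bytes, never decoded
        key_type = stack[start + key_length:start + entry_size].decode("ascii")
        keys.append((key, key_type))
    return keys
```

Each returned key would then be stored in the database index alongside a reference to the record it was built from.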

SEARCH-TABLE (677 bytes (default for 5-byte keys), 806 bytes (default for 8-byte keys)): NAMESET returns in the Search-table an array of key search ranges for the requested Search Strategy.

The Search Strategy was specified in the Function parameter. The Search-table contains either positive search ranges, negative search ranges or probes. Refer to the description of the Function parameter in the NAMESET Parameters section for more information on these.

A search application will use the key ranges in the Search-table to search a database index which has been previously loaded with SSA-NAME3 Keys.

The Search-table is preceded by a single field, called the Preferred Key. For more information on the Preferred Key, refer to the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS.

The Preferred Key is either 5 bytes or 8 bytes long depending on the key length specified in the Algorithm Definition (SSA-NAME3-OPTIONS #23).

Immediately following the Preferred Key are the Search-table entries. Each Search-table entry is 32 bytes long. When 8 byte keys are in effect, each entry is 38 bytes long. A Search-table entry is sometimes referred to as a ’depth of search’, a ’level of search’, a ’search range’ or a ’set’.

The contents of a Search-table entry are as follows:

From Key
The Key from which to start a search for this search range.

To Key
The Key at which to end a search for this search range.

From Key and To Key are used, for example, in an SQL Select statement of the form:
SELECT * FROM CUSTOMER-SSA-NAME3-TABLE
WHERE SSA-NAME3-KEY >= SSA-NAME3-FROM-KEY
AND SSA-NAME3-KEY <= SSA-NAME3-TO-KEY
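
As an illustration only, the range selection above could look like this with Python's standard sqlite3 module. The table and column names are hypothetical, the sample keys are made up, and the sketch assumes that the database compares binary keys byte by byte, which SQLite does for BLOB values.

```python
import sqlite3
from typing import List


def search_range(conn: sqlite3.Connection, from_key: bytes, to_key: bytes) -> List[str]:
    """Return the names whose stored key lies between from_key and to_key inclusive."""
    rows = conn.execute(
        "SELECT name FROM customer_keys "
        "WHERE ssa_key >= ? AND ssa_key <= ? ORDER BY ssa_key",
        (from_key, to_key),
    )
    return [row[0] for row in rows]


# Hypothetical index table loaded with made-up 5-byte keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_keys (ssa_key BLOB NOT NULL, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_ssa_key ON customer_keys (ssa_key)")
conn.executemany(
    "INSERT INTO customer_keys VALUES (?, ?)",
    [
        (bytes([0x10, 0, 0, 0, 1]), "first"),
        (bytes([0x10, 0, 0, 0, 5]), "second"),
        (bytes([0x20, 0, 0, 0, 0]), "third"),
    ],
)

matches = search_range(conn, bytes([0x10, 0, 0, 0, 0]), bytes([0x10, 255, 255, 255, 255]))
```

Binding the keys as parameters, rather than building the statement text by hand, avoids problems with binary values such as hex 00 inside the keys.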

Depth

The depth of search for this Search-table entry. This field is no longer used and is only present for upward compatibility from SSA-NAME3 versions prior to V1.6.

Scale

The estimated number of records that this search range will find.

The value is returned as two digits of the form ZN, where Z specifies the coarse size of the set as a power of ten (a Scale of ’10’ is 10, ’20’ is 100, ’30’ is 1000, etc.) and N selects a finer factor (multiplier) from this table:

N	Factor
0	1.00
1	1.26
2	1.58
3	2.00
4	2.51
5	3.16
6	3.98
7	5.01
8	6.31
9	7.94

To calculate the number of estimated records, use the following formula:
10 to the power of Z x Factor(N)

For example, for a Scale of 23:
10 to the power of 2 x 2.00 = 200

The Scale covers the range from 0 to about 8,000,000,000 records. It is based on the Key-type and the number of records in the file. It is only useful if the Algorithm contains a Frequency Table built from the names being searched, and the NAMESET FILESIZE= parameter specifies a file size within 10 per cent of the actual size.

Scale is mainly useful to help the application decide whether a search range is too wide.
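
The Scale formula can be sketched as a small Python helper. The function name and the threshold mentioned afterwards are hypothetical; the factor values are those listed in the table above.

```python
# Multipliers selected by the second Scale digit (N = 0..9).
FACTORS = (1.00, 1.26, 1.58, 2.00, 2.51, 3.16, 3.98, 5.01, 6.31, 7.94)


def estimated_records(scale: str) -> float:
    """Convert a two digit Scale value ZN into 10 to the power of Z x Factor(N)."""
    if len(scale) != 2 or not (scale.isascii() and scale.isdigit()):
        raise ValueError("Scale must be two decimal digits")
    z, n = int(scale[0]), int(scale[1])
    return (10 ** z) * FACTORS[n]
```

An application could compare the result with a limit of its own choosing, for example skipping or confirming any search range estimated at more than a few thousand records.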

Contents

Two digits representing the number of words and initials used to build this search range. The first digit is the number of whole words and the second is the number of characters in the last word (if it was not whole).

For example, a Contents of ’30’ says this search range was built from three whole words in the name. A Contents of ’21’ says this search range was built from two whole words and the initial from the third word.

A special case is when the second digit contains a ’2’. This means that the initial represents only the uncommon words that begin with that initial, and not the common words.

The application can check this field for a value of ’00’, which marks the end of the Search-table.

Key Type

The encoding method used for the keys in this range. Refer to the description of Key-types in the Keys-stack parameter above.

Set Id

Sometimes referred to as Range-type, this identifies the type of a search range.

Possible values are:

ID	Type of range	Generated by
B	Bad	Empty name after Cleaning/Formatting
C	Cascade	Default; FINE; COARSE; WORDS
N	Negative	NEG
P	Customset	CUSTOMSET=
2	Secondary	SECONDARY, SECMINOR, SECMAJOR, SECALL
W	Word probe	PROBESWORD, PROBESALL
I	Word/Initial probe	PROBESINIT, PROBESALL
S	Code probe	SSA-NAME3-OPTIONS #6 = C or Y

Sequence

The sets are numbered 00, 01, 02, and so on. A break between set numbers occurs when two logically distinct sets of ranges are present. For example, a break occurs between probes and the positive cascade. A break will also occur between the ranges generated for different pieces of a Compound- or Account-name.

Filler
Spare area, reserved for future use.

The structure of a Search-table entry for 5-byte and 8-byte keys:

Structure for 5-byte keys
Name	Offset	Size
From Key	0	5
To Key	5	5
Depth	10	2
Scale	12	2
Contents	14	2
Key Type	16	2
Set Id	18	1
Sequence	19	2
Filler	21	11

Structure for 8-byte keys
Name	Offset	Size
From Key	0	8
To Key	8	8
Depth	16	2
Scale	18	2
Contents	20	2
Key Type	22	2
Set Id	24	1
Sequence	25	2
Filler	27	11

The number of entries in the Search-table is controlled by the Algorithm Definition parameter:
SEARCH-TABLE-SIZE=nn

The default value is 21 and thus the default size of this parameter for 5-byte keys is calculated as follows: (32 x 21) + 5 = 677. The default size for 8-byte keys is calculated as: (38 x 21) + 8 = 806. It is sometimes necessary to increase the number of entries in the Search-table when dealing with long names and addresses and doing negative searches with probes, otherwise truncation will occur, affecting the search quality. The maximum number of entries is 99.
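
Putting the layout together, a search application might walk the Search-table as sketched below. The names are hypothetical, the text fields are assumed to be ASCII, and the loop stops at the first entry whose Contents field is ’00’, the end marker described earlier.

```python
from typing import List, NamedTuple, Tuple


class SearchRange(NamedTuple):
    from_key: bytes
    to_key: bytes
    scale: str
    contents: str
    key_type: str
    set_id: str
    sequence: str


def search_table_size(entries: int = 21, key_length: int = 5) -> int:
    """Entry size is two keys plus 22 bytes; the Preferred Key precedes the entries."""
    return (2 * key_length + 22) * entries + key_length


def parse_search_table(table: bytes, key_length: int = 5) -> Tuple[bytes, List[SearchRange]]:
    """Return the Preferred Key and the search ranges up to the '00' end marker."""
    entry_size = 2 * key_length + 22
    preferred_key = table[:key_length]
    ranges = []
    for start in range(key_length, len(table) - entry_size + 1, entry_size):
        entry = table[start:start + entry_size]
        # Fields after the two keys: Depth, Scale, Contents, Key Type, Set Id, Sequence, Filler.
        tail = entry[2 * key_length:].decode("ascii", errors="replace")
        contents = tail[4:6]
        if contents == "00":
            break  # end of the Search-table
        ranges.append(SearchRange(
            from_key=entry[:key_length],
            to_key=entry[key_length:2 * key_length],
            scale=tail[2:4],
            contents=contents,
            key_type=tail[6:8],
            set_id=tail[8],
            sequence=tail[9:11],
        ))
    return preferred_key, ranges
```

The Depth field is skipped deliberately, since it is no longer used. Passing key_length=8 handles the 38 byte entries produced when 8-byte keys are in effect.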

CATEGORIES (20 bytes): When NAMESET is processing a name to build keys or search ranges, each time an Edit-list rule is executed, its category name is added to the Categories list. For more information on Category names, refer to the Edit List Definition chapter of the DEFINITION and CUSTOMIZATION GUIDE FOR SSA-NAME3 SERVICE GROUPS.

For example, Categories may contain:
PTPPPR
if the name contained a personal title (PT), a prefix word (PP) and a prefix replace word (PR).
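
Splitting the Categories list into individual category names can be sketched as follows. The function name is hypothetical, and the sketch assumes that every category name is two characters long, as in the example, and that unused positions hold spaces.

```python
from typing import List


def split_categories(categories: str) -> List[str]:
    """Split a Categories value into two character category names."""
    text = categories.rstrip()  # drop the unused, space filled positions
    return [text[i:i + 2] for i in range(0, len(text), 2)]
```

With the value above padded to 20 bytes, the result is the list PT, PP, PR.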

WORK-AREA (30,000 bytes): The Work-area is used by the Service as general purpose scratch-pad memory. For more information on the Work-area, refer to the Work-area section.
