Table of Contents

Search

  1. Preface
  2. Introduction
  3. Defining a System
  4. Flattening IDTs
  5. Link Tables
  6. Loading a System
  7. Static Clustering
  8. Simple Search
  9. Search Performance
  10. Miscellaneous Issues
  11. Limitations
  12. Error Messages

Create_IDT

Create_IDT

The simplest form of IDT is created from column values extracted from a single User Source Table. The general form of the syntax is:
CREATE_IDT IDT-Name SOURCED_FROM [connection_string] source_clause [TRANSFORM transform_clause] [SELECT_BY select_clause] sync_clause ;
where
IDT-Name
is the name of the IDT Table to be created. This must be the same as defined by the
IDT-NAME
parameter in the
IDX-Definition
section and the
PHYSICAL-FILE-NAME
in the
Logical-File-Definition
.

Connection String

The
connection_string
is used to specify connection information for access to the UST on the source database. When using flat file input
connection_string
must be set to
flat_file
.
The format of the
connection_string
is documented in the
OPERATIONS guide, Rulebase/Database Names
section.
When connecting to a source database, the
connection_string’s SystemQualifier
has no meaning, and should be set to an unused number such as 99 to highlight that fact. For example, when connecting to an Oracle source database the
connection_string
is as follows:
odb:99:scott/tiger@oracle920
If a
connection_string
is not supplied, it defaults to the connection details of the database used when initializing the Rulebase, i.e. the value of
SSA_DBNAME
environment variable.
If the source tables
are to be synchronized
, you
must
specify the userid associated with the UST database. This is the same userid that was created and used to install the Update Synchronizer components. Further more, the (default) SSA userid on the UST database must have SELECT privileges on the source tables to be referenced. See the
Installation Guide
for details.
If the source tables
do not require synchronization
, any valid userid with SELECT privileges on the source tables may be used.

Source Clause

This section is used to specify the source of the data.
The
source_clause
is used to nominate the UST columns that are to be extracted when creating the IDT. Source fields can be either added directly into the IDT or used in a transformation process, the result of which is added to the IDT.
The
source_clause
syntax is:
[src_schema.]table.column [(PKn)] [tgt_col [r] [fmt(len)]] [,...]
where
src_schema
is the name of the schema that the
table.column
belongs to. The default value is your userid.
table
the source table name which contains the
column
.
Oracle:
Synonyms (private, public, recursive) may be used but are converted to the real
schema.table
name during parsing.
column
the name of the source column to be extracted.
(PKn)
nominates
column
as the nth component of the primary key.
tgt_col
the name of the column when stored in the IDT. If omitted, it defaults to the value of
column
. If
tgt_col
is prefixed with ’$’, it is treated as a
virtual field
. Virtual fields are not stored in the IDT. They can only be referenced in a
transform_clause
. If
tgt_col
is prefixed with ’$$’, it is treated as a special type of
virtual field
that is read from the source database during Table Loading.
Tgt_col
names prefixed by an underscore (’_’) are reserved for internal use.
[r]
the number of times that
tgt_col
will repeat in the IDT. This parameter is used to declare a repeating field for the flattening process (see the Flattening IDTs). The default value is 1. Column names are generated using the same semantics as the GROUP= clause (described in the
Files and Views
section). Note that the braces are required.
fmt (len)
the format and length of the
tgt_col
in the IDT. When omitted, they default to the format and length of the source column. Valid formats are defined in the File Definition section of this guide.
Note that the default length may be insufficient when the source and target databases use different character sets. Refer to the Globalization section of the
OPERATIONS
manual for details.
When the source is a flat file
src_schema.table.column
must be omitted (as there are no USTs), and
tgt_col
,
fmt
and
len
must be provided to describe the layout of the IDT.

Transform Clause

A
transform_clause
is really an optional part of a
source_clause
or
merge_clause
. It is used to specify how virtual fields are to be combined/transformed and characteristics of the resulting field stored in the IDT.
The
transform_clause
syntax is:
transform_rule tgt_col [r] fmt(len) [order n] [,...]
where
transform_rule
nominates the transformation process which will convert virtual fields into a value to be stored in
tgt_col
. Details appear in the next section.
tgt_col
is the name of the column to be stored in the IDT.
fmt (len)
is the format and length of the
tgt_col
.
[r]
an optional number of times that tgt_col will repeat in the IDT. This parameter is used to declare a repeating field for the flattening process. See the Flattening IDTs section for details.
The default value is 1. Column names are generated using the same semantics as the
GROUP=
clause (described in the
Files & Views
section). Note that the braces are required.
order n
is used to override the default order of fields in the IDT. Normally fields are placed in the IDT in the order of definition in the
source_clause
and
transform_clause
. You may override this order by nominating an integer value for n, starting from 1.

Transform Rules

Transform rules fall into three categories:
  • rules that use no virtual fields
    (Source-Id, Insert-Constant
    )
  • rules that operate on a single virtual field
    (Upper, Lower, Convert-Field
    )
  • rules that operate on many fields
    (Append, Concat, Insert-Field)
Each virtual field can only be referenced once in the
transform_clause
. In the unlikely event that the same source column is to be used in more than one transform clause, add an extra
source_clause
for that column but give it a different virtual field name.
The transform rules use the following format:
Source-Id Insert-Constant ("text") Upper (vf)Lower (vf) Insert-Field (vf, offset, ...) Convert-Field(<Source Field>, <Date Format>) <Target Field> C(<Length in Bytes>) Convert-Field(<Source Field>, <Encoding Format>) <Target Field> C(<Length in Bytes>) Append ( vf | lower (vf) | upper (vf) | "text" , ... ) Concat ( vf | lower (vf) | upper (vf) | "text" , ... )
You can use one of the following transform rules:
Source_Id
Generates a unique ID into
tgt_col
. It should only be used in conjunction with the
AUTO-ID
parameter. The name of the
tgt_col
must match that defined in
AUTO-ID-FIELD
in the
IDX-Definition
section.
Insert-Constant
Injects a string of
text
into
tgt_col
.
Upper
Converts the virtual field
vf
defined in the
source_clause
to uppercase.
Lower
Converts the virtual field
vf
defined in the
source_clause
to lowercase.
Insert-Field
Combines a number of virtual fields into
tgt_col
. Each field is stored at the specified
offset
in
tgt_col
. Offsets start from 0.
Convert-Field
for Date
Converts the date format of the source field to the specified date format for the target field. The rule can also truncate the date based on the length that you specify in bytes. The date format of the source field must be
YYYY-MM-DD
.
The Convert-Field rule uses the following parameters:
  • Source Field
    . Specifies the field whose date format you want to use.
  • Date Format
    . Specifies the date format for the target field.
    Use one of the following values:
    • 0. Retains the same date format as that of the source field.
    • 1. Converts the date format of the source field to the
      DD-MM-YYYY
      format.
    • 2. Converts the date format of the source field to the
      DD-MON-YYYY
      format.
    • 3. Converts the date format of the source field to the
      DD-MON-YY
      format.
    • 4. Converts the date format of the source field to the
      MM-DD-YYYY
      format.
    MON
    indicates the complete name of a month.
  • Target Field
    . Field that uses the converted date format.
  • Length in Bytes
    . Length of the converted date to use. When you use the date format 2, ensure that the length is at least 11 bytes.
The following Convert-Field rule converts the date format of the
$date
field for the
DOB
field:
convert-field($date, 2) DOB C(10)
Convert-Field
for Encoding Format
Converts the encoding format for fields from UTF-8 to UTF-16 and from UTF-16 to UTF-8.
The Convert-Field rule uses the following parameters for the encoding format conversion:
  • Source Field
    . Specifies the field that you want to use for conversion.
  • Encoding Format
    . Specifies the encoding format for the target field.
    Use one of the following formats:
    • convert-field(8)
      . Converts from UTF-16 to UTF-8.
    • convert-field(6)
      . Converts from UTF-8 to UTF-16. Converting to UTF-16 might double the size. The transforms truncate the output if it is not large enough for the converted text.
  • Target Field
    . Field that uses the converted encoding format.
  • Length in Bytes
    . Length of the converted encoding format to use.
Consider the following sample user table:
create_idt IDT547 sourced_from #SSA_DBNAME# #SSA_SCHEMA#.table.PolicyHolderName $field1, #SSA_SCHEMA#.table.Gender $field2, #SSA_SCHEMA#.table.Address $field3, #SSA_SCHEMA#.table.PolicyID (PK1), #SSA_SCHEMA#.table.PINCODE $field4
Use the following format to convert from UTF-16 to UTF-8:
TRANSFORM convert-field($field1, 8) PolicyHolderName C(20), convert-field($field2, 8) Gender C(6), convert-field($field3, 8) Address C(50), convert-field($field4, 8) PINCODE C(10)
Append
Combines the virtual fields and text by stripping trailing spaces from all fields and joining them together into
tgt_col
.
Concat
Functions same as
Append
but adds a single space between the fields when joining them.

Select Clause

Is any valid SQL expression (for a WHERE clause) that can be used to select a
subset of records
from the UST. That is, the
select_by
clause acts a final filter to remove records after selecting / joining rows from the UST. IIR does not parse this expression. It is simply appended to the WHERE clause generated by IIR using an AND logical condition.
SELECT ... WHERE <IIR_expression> AND (<select_clause>)
It is important to ensure that the
select_clause
is syntactically correct. Failure to do so will result in run-time SQL errors in the Table Loader and/or Update Synchronizer.
All columns referenced in the
select_clause
must also appear in the
sourced_from
list (for SYNC systems only). This ensures that they are replicated in the IDT and that correct synchronization will occur.
Do not try to limit the number of rows loaded from the UST using a physical limit. For example,
select_by ROWNUM <= 1000
This approach instructs the SQL Optimizer to return the first 1000 rows but has the disadvantage that the "first 1000 rows" may not be the same ones when the system is loaded a second time. It will produce inconsistent results.
Use a logical limit when selecting a subset of records from the UST. For example,
select_by EmpId >= 50000 AND EmpId <= 51000
This ensures a repeatable set of records.

Sync Clause

The sync clause determines whether or not this IDT will be synchronized with updates to the UST. If it is synchronized an optional Synchronization Level can be specified. The syntax is:
NOSYNC | SYNC [sync_level [txn_source_clause]]
where
sync_level
is either:
REJECT_DUPLICATE_PK
Rejects all duplicates (the default).
REPLACE_DUPLICATE_PK
Replaces duplicates when synchronizing from an NSA Txn-Source (see Txn-Source Clause below).
WARN_DUPLICATE_PK
Produces a warning when a duplicate is added.
ALLOW_DUPLICATE_PK
Allows non-unique PKs.
Refer to the
OPERATIONS guide, Update Synchronizer
chapter for more details about the Synchronization Level.

Txn-Source Clause

Use
txn_source_clause
to specify the method to provide synchronizer transactions. If you do not specify, an identity table created from the user source tables uses trigger-generated transactions and an identity table created from a flat-file uses flat-file transactions. Use the following syntax for
txn_source_clause
:
TXN-SOURCE { TRIGGER | MANUAL | FLAT_FILE | NSA }
where
TRIGGER indicates that the triggers are automatically created on the user source tables to generate transactions in the Synchronizer’s Transaction Table.
MANUAL indicates that the triggers are not created on the user source tables. You have to manually insert transactions in the Synchronizer’s Transaction Table when a user source table is updated.
FLAT-FILE indicates that the synchronizer transactions are provided in a flat-file. Use this option only when you source data from a flat-file.
NSA indicates that the synchronizer transactions are provided in a NSA Transaction Table. You can use this option when you source data from a flat-file.
For example:
CREATE_IDT test-idt SOURCED_FROM odb:99:scott/tiger@oracle8.17 SCOTT.EMP.EMPNO (PK) EmpNo, SCOTT.EMP.ENAME EmployeeName SELECT_BY (scott.emp.empno > 7800) SYNC ;
The example creates the
test-idt
identity table. The data is extracted from an Oracle service named
oracle8.17
by using the userid
scott
whose password is
tiger
. The identity table contains two fields,
EmpNo
and
EmployeeName
. The fields have formats and lengths similar to their corresponding source fields,
EMPNO
and
ENAME
, in the SCOTT.EMP table. Only employees with employee numbers greater than 7800 are extracted and loaded into the IDT.
Any updates made to
SCOTT.EMP
are applied to
test-idt
by the Update Synchronizer by using the default synchronization level of
REJECT_DUPLICATE_PK
. Synchronizer transactions are generated by using triggers on the source table.

Flat_File Input

The syntax for
flat_file
source is similar to the database source, but differs in the following aspects:
  • DB source column names are omitted
  • the layout of the file is specified using the target fields
  • each field must have an explicit format and length definition
  • PKn
    definitions appear before field names
  • virtual fields (used for transforms) must provide format and length information
For example:
CREATE_IDT IDT264 SOURCED_FROM flat_file (pk) RowId W(16), $FirstName W(50), $MiddleName W(50), $LastName W(50) TRANSFORM concat (upper ($LastName), $MiddleName, $FirstName) Name W(150), SYNC REPLACE_DUPLICATE_PK TXN-SOURCE NSA ;

0 COMMENTS

We’d like to hear from you!