Operations Guide

10.5
- 10.5 HotFix 2
- 10.5 HotFix 1
- 10.2 HotFix 1
- 10.2
- 10.1 HotFix 1
- 10.1
- 10.0 HotFix 1
- 10.0

Back Next

Loading IDTs

The MDM-RE Table Loader generates DBMS load files in delimited text format by default. Specify the

Loader- Definition

option

FIXED

to generate fixed length binary files instead.

To verify that the input data has been read and processed properly, you may specify the

Loader- Definition

option

Keep-Temp

. This will prevent the DBMS loader files from being deleted after the load completes so that they may be examined.

Flat-File Input

If input data is read from a flat-file, make sure that

the file contains fixed length records

FORMAT=Binary

is specified in the

Logical-File-Definition

the format of the input records matches the Input-View

the record length is the sum of the field lengths in the View

MSQ

Data loaded from a flat-file into

CHAR

VARCHAR

columns will be translated from the client’s code page into UNICODE, transferred to the server and then translated into the server’s code page and stored.

The client’s code page should be identical to the server’s code page, otherwise the conversion could be lossy.

If the character set of the data file does not match the client’s code page (Locale) specify the

DATABASEOPTIONS= IDTCP

parameter to specify the code page of the data.

User Source Table Input

During data extraction from a User Source Table’s

CHAR/VARCHAR

columns, the data is translated from the Server’s code page to the client’s code page (Table Loader). The client’s code page should be identical to the server’s code page; otherwise the conversion could be lossy.

Oracle

A safe approach is to use a UNICODE character set for the database and to specify a UNICODE character set for the client (MDM-RE Servers and utilities). MDM-RE automatically requests UNICODE data to be returned as UTF-16 .This ensures that no lossy conversions are performed when reading/writing data to/from the database.

The mass loader file generated by the Table Loader will automatically use a

Fixed

length format when UNICODE columns are present.

MSQ

NCHAR

and

NVARCHAR

columns are not converted. They remain in UNICODE format.

Target Column Size

The format and length of an IDT column defaults to the same values as the source column. In most cases, this is adequate. However, if the source and target databases do not use the same character set, it may be necessary to increase the size of the target column to accommodate the change of encoding.

For example, suppose a source column is defined as CHAR(5) encoded in a Central European character set such as Windows Code Page 1250. Suppose a particular row in the source table contains 5 bytes of data, with four of them being Latin characters (hexadecimal values <0x7F) and one of the characters being the Latin character A with an Acute (= 0xC1 = U+00C1).

When encoded in this character set, the data only requires 5 bytes. However, if it is now stored in an IDT on a database that uses UTF-8 as its character set, the data will be converted. The Latin characters will still only require one byte when expressed as UTF-8 but the

A + Acute

will be encoded using 2 bytes, with the total storage requirement being 6 bytes.

If the default column size is insufficient, use a length override in the definition of the target column to increase its size.

Rename Saved Search

Table of Contents

Operations Guide

Operations Guide

Loading IDTs

Loading IDTs

Flat-File Input

MSQ

User Source Table Input

Oracle

MSQ

Target Column Size