Name of the column that you want to set as the primary key.
Name of the column to store the source information of the data.
You can use only
as the column name.
You can also use the
attribute to specify whether the source information is part of the input data. Set the attribute to YES if the source information is part of the input data, and to NO if it is not. If you set the attribute to NO, ensure that you specify the
Name of the source for the input data. If the input data does not contain the source name, use the
parameter to specify the source name. The source name cannot exceed 32 bytes.
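The 32-byte limit applies to the encoded size of the name, not its character count, so names with multi-byte characters reach the limit sooner. The following is a minimal sketch of such a check. The function name is hypothetical and UTF-8 is an assumed encoding; the source does not state which encoding the product uses.

```python
MAX_SOURCE_NAME_BYTES = 32  # limit stated in the parameter description


def is_valid_source_name(name: str, encoding: str = "utf-8") -> bool:
    """Return True if the encoded source name fits within 32 bytes.

    Hypothetical helper; the encoding is an assumption, not documented behavior.
    """
    return len(name.encode(encoding)) <= MAX_SOURCE_NAME_BYTES
```

For example, a name of 17 two-byte characters occupies 34 bytes in UTF-8 and fails the check even though it has fewer than 32 characters.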
Name of the column in which you want to store the link identifiers.
Size of the column that stores the link identifiers. Use 40 bytes as the column size.
Indicates whether you want to delete the tables that the load job created in the repository when you run the load job again.
Set to true if you want to delete the tables, and set to false if you want to append the data to the existing tables. Default is false.
Specifies the source of the data that you want the MapReduce jobs to use.
The previous example includes data only from Oracle, MySQL, and SAP for the MapReduce jobs to process.
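Conceptually, restricting the jobs to a set of sources behaves like a filter over the input records. The sketch below illustrates that idea only. The record layout, the `source` field name, and the case-sensitive comparison are all assumptions, not the product's implementation.

```python
ALLOWED_SOURCES = frozenset({"Oracle", "MySQL", "SAP"})


def filter_by_source(records, allowed=ALLOWED_SOURCES, source_field="source"):
    """Keep only records whose source is in the allowed set.

    Hypothetical illustration: records are plain dicts and the comparison
    is case-sensitive.
    """
    return [r for r in records if r.get(source_field) in allowed]
```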
Optional. Indicates whether you want to have a separate table to store the link information.
Set to true if you want to have a separate table, and set to false if you do not want to have a separate table. If the value is false, ensure that you specify the
parameter. Default is false.
Indicates whether you want to add the link number to the record key.
If you set
. Default is false.
Optional. Indicates the total number of records that you can delete at once. Default is 1000.
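A deletion limit of this kind usually means that the keys are processed in fixed-size batches. The following sketch shows the batching arithmetic with the documented default of 1000. The function name is hypothetical, and the actual delete call is omitted because the source does not describe it.

```python
from typing import Iterator, List, Sequence


def delete_batches(keys: Sequence[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size keys.

    Hypothetical helper; each yielded batch would be passed to one delete call.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(keys), batch_size):
        yield list(keys[start:start + batch_size])
```

With 2,500 keys and the default batch size, this produces three batches of 1000, 1000, and 500 keys.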
Optional. Name of the column from which to create a partition identifier that is added to the key. Use the following attributes to define the partition identifier:
length. Defines the length of the column that you can add as the prefix to the keys. The maximum length of the column that you can use is 8 bytes. Default is 2 bytes.
part_of_rowkey. Indicates whether the key includes the partition identifier. Set to true if you want to prefix the key with the partition identifier, and set to false if you do not want to prefix the key with the partition identifier.
<PARTITION_COLUMN_NAME length="2" part_of_rowkey="YES">STATE</PARTITION_COLUMN_NAME>
If you configure the
parameter for the initial linking job, you must configure the
parameter when you run other jobs to update or increment the initial data.
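To make the effect of the length and part_of_rowkey attributes concrete, the sketch below builds a key with a partition prefix. It assumes that the identifier is simply the first length bytes of the column value; the source does not say how the product derives the identifier, so treat the function and its arguments as hypothetical.

```python
def build_row_key(record: dict, key: bytes, partition_column: str = "STATE",
                  length: int = 2, part_of_rowkey: bool = True) -> bytes:
    """Prefix key with a partition identifier taken from partition_column.

    Assumption: the identifier is the first `length` bytes of the UTF-8
    encoded column value. The boolean stands in for the attribute value.
    """
    if not 1 <= length <= 8:  # documented maximum is 8 bytes
        raise ValueError("length must be between 1 and 8 bytes")
    if not part_of_rowkey:
        return key
    prefix = str(record[partition_column]).encode("utf-8")[:length]
    return prefix + key
```

Under these assumptions, a record with STATE set to CA and a key of 12345 yields CA12345, so records from the same state sort together in the table.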
Base name for the tables that the initial loading job creates in the repository. The initial loading job uses the following format for the table names:
For example, if you specify
, the initial loading job creates an index table named
, a primary key table named
, and a link table named
Optional. Indicates whether to persist all the columns that you define in the
section in the repository. Configure the
parameter only when you perform advanced matching.
Set to true if you want to persist all the columns in the repository. Set to false if you want to persist only the columns that you use to index data in the repository. Default is false.
If you plan to run the initial linking, initial loading, incremental linking, update linking, or repository data deletion job with the matching rules file, you must set
Name of the column family that groups all the columns in the repository table.
Maximum number of REST requests that you can run concurrently. A higher number improves search performance but uses more memory. You can configure the value based on the amount of available memory. Default is 200.
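The trade-off between throughput and memory can be seen in a generic concurrency cap. The sketch below is not the product's REST layer. It is a hypothetical worker pool in which at most max_concurrent tasks run at once, with 200 mirroring the documented default.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(tasks, max_concurrent: int = 200):
    """Run zero-argument callables with at most max_concurrent in flight.

    Hypothetical illustration: each worker thread holds memory while it
    runs, so a higher cap raises both throughput and memory use.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Collect results in submission order.
        return [future.result() for future in futures]
```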