Recommendations for IBM DB2

Recommendations for IBM DB2

You can configure the parameters related to the IBM DB2 environment, registry variables, and database file configuration parameters.
The following table lists the recommendations related to the IBM DB2 environment:
Parameter
Recommended Setting
Description
Physical Disk Drives (Tablespace)
Different physical drives for different tablespaces.
To reduce the amount of blocked input/output, you can increase input/output parallelism. Achieve input/output parallelism by storing user data tablespaces, temporary tablespaces, and transaction logs on different physical disk drives. Batch operations can access all the paths in parallel, which increases the throughput by reducing the input wait times and output wait times.
Physical Disk Drives (Container)
Different physical drives for different containers.
If more physical disk drives are available, you can increase input/output parallelism by extending parallelism to the container level. To increase input/output parallelism, place all containers for a tablespace on different physical disks. The IBM DB2 prefetchers and input/output cleaners access these containers in parallel without blocking each other, thereby increasing the throughput.
Processing Large Dataset
When you process a large data set, use the following command to rebind packages:
db2 bind @db2cli.lst blocking all grant public sqlerror continue CLIPKG 10
Perform the step every time you process a large data set. It is better to do so after the initial data load.
Reorganize Match Tables
Routinely determine and re-organize the match key tables (C_<Base Object>_STRP).
To improve the performance of the SearchMatch API, you need to reorganize match key tables based on their primary key column,
SSA_KEY
. To determine if a match key table needs reorganization, perform a reorganization check and analyze the results. The cluster ratio of the primary key index appears in the
CLUSTERRATIO
column, F4, of the reorganization check result. The cluster ratio must be close to 100% for optimal performance.
Determine when to reorganize the match key table by noting the cluster ratio at which you observe degradation in the SearchMatch API performance. Use the IBM
DB2 REORGCHK
and
REORG
commands to reorganize tables.
Update the match key table statistics so that the IBM DB2 optimizer can use the table layout that the reorganization generates.
The following table lists the recommendations for the registry variables:
Database Parameters
Recommended Setting
Description
DB2_INLIST_TO_NLJN
NO
Configures the optimizer to prefer or not prefer nested loop joins.
The DB2 SQL compiler might rewrite an IN list predicate as a join. The rewrite might provide better performance if you define an index on the joined columns. When the optimizer does not have accurate statistics, it might not be able determine the best join for the rewritten join. Set the variable to NO to prevent the optimizer from favoring nested loop joins in such cases.
DB2_ANTIJOIN
YES
Configures the optimizer to transform subqueries into anti-joins. The MDM Hub has queries that use NOT EXISTS subqueries. Set the registry variable to YES to look for possibilities to transform NOT EXISTS subqueries into anti-joins that IBM DB2 processes more efficiently.
DB2_REDUCED_OPTIMIZATION
REDUCE_LOCKING
Configures reduced optimization features or rigid use of optimization features at the specified optimization level. Set the registry variable to
REDUCE_LOCKING
to favor
NLJOIN
over
MSJOIN
whenever possible to reduce the amount of locking on the outer table.
DB2_EXTENDED_OPTIMIZATION
ON
,
ENHANCED_MULTIPLE_DISTINCT
,
IXOR
,
SNHD
Configures whether or not the query optimizer uses optimization extensions to improve query performance.
The
ON
,
ENHANCED_MULTIPLE_DISTINCT
, and
SNHD
values specify different optimization extensions.
DB2_HAS_JOIN
YES
Configures hash join as a possible join method when compiling an access plan.
Tune hash join to get the best performance. For best performance of a hash join, avoid hash loops and overflows to disk.
To tune hash join performance, perform the following tasks:
  • Estimate the maximum amount of memory available for the sheapthres configuration parameter.
  • Tune the sortheap configuration parameter.
The following table lists the recommendations for the database file configuration parameters:
Database Parameters
Recommended Setting
Description
LOCKLIST
AUTOMATIC
Allocates the amount of storage to the lock list of a database.
Multiple MDM Hub processes use "locks at the row-level" to complete tasks and to support concurrency. The number of locks that IBM DB2 needs to acquire depends on the number of rows to process. If the incoming volume differs greatly in size, set the parameter to AUTOMATIC to allow the database manager to determine the appropriate value. If you conservatively tune the
LOCKLIST
parameter value, lock escalations can occur, and some of the MDM Hub operations can fail due to lock timeouts.
MAXLOCKS
AUTOMATIC
Configures the percentage of the lock list that one application can use.
Most MDM Hub processes run under the scope of a single application. Such single applications can acquire many row-level locks, consuming most of the available lock list. If the incoming volume differs greatly in size, predicting the
MAXLOCKS
parameter value is difficult. Set the parameter to AUTOMATIC to allow the database manger to determine the appropriate value. If you conservatively tune the
MAXLOCKS
parameter value, lock escalations can occur, and some of the MDM Hub operations can fail due to lock timeouts.
CATALOGCACHE_SZ
25000 or higher
Configures the maximum memory that the catalog cache can use from the shared memory of the database.
IBM DB2 stores system catalog information in the catalog cache. The MDM Hub comprises many dynamic SQL queries that reference multiple metadata objects. If the catalog cache is large, IBM DB2 can retain information for some of the metadata objects from the system catalogs in the memory. If subsequent dynamic SQL queries require the same metadata objects, the compilation is quick. The MDM Hub comprises many frequently accessed metadata objects. Therefore, you must set the
CATALOGCACHE_SZ
parameter value to 25000 or higher.
LOGBUFSZ
4096 or higher
Configures the amount of the database heap to use as a buffer for log records before writing the records to disk.
The MDM Hub creates logs for most of its operations in the IBM DB2 transaction logs. IBM DB2 buffers the log records in the log buffer before writing them to the disk.
If the log buffer is large, IBM DB2 writes the log records to the disk less frequently. This makes disk input/output for log records more efficient. The default value for the database configuration parameter is not sufficient for an average MDM Hub environment. Set the parameter to 4096 pages or higher.
LOGFILSIZ
128000 or higher
Configures the number of log records written to the log files.
A single MDM Hub transaction can contain many DML queries resulting in many log records that might span across many log files. A large log file size avoids the need to create new log files frequently. If IBM DB2 creates new log files frequently, it adversely influences the performance of input/output bound systems.
The total log space for a database is equal to the total number of primary and secondary log files multiplied by the log file size. The database must have adequate log space to ensure that the MDM Hub transactions do not run out of log space and fail. If the MDM Hub transactions fail, the database needs more time to roll back the transactions. Set
LOGFILSIZ
to 128000 or higher to ensure that the MDM Hub transactions do not fail and need a roll back. Also, you must consider the number of primary and secondary logs for a complete equation of log space.
LOGPRIMARY
100
Configures the number of primary log files to be pre-allocated. IBM DB2 creates primary log files when you activate the database. If the uncommitted transaction exhausts the primary log space, IBM DB2 creates secondary log files as needed. Set the
LOGPRIMARY
parameter value to 100 to efficiently handle the MDM Hub processes. Secondary log files act as a backup in cases where long transactions can exhaust the entire primary log space.
LOGSECOND
100
Configures the number of secondary log files that IBM DB2 can create and use for recovery log files.
Log file creation can adversely impact the performance based on the size of the log file. If you allocate sufficient primary log space, transaction performance increases because the database does not create secondary log files frequently. Set the
LOGSECOND
parameter value to 100 to cover unexpected long transactions due to large incoming volumes, especially during batch processes. The sum of the values of
LOGPRIMARY
and
LOGSECOND
must be 200.
PCKCACHESZ
128000
Configures the package cache size which is allocated out of the database shared memory.
The MDM Hub has many dynamic SQL statements. Each dynamic SQL statement has a compiled package associated with it. IBM DB2 caches these packages in the package cache memory. You must configure an appropriate package cache size to avoid package cache overflows, which adversely influence performance.
Experiment with the values for the package cache size. Initially, set the parameter value to 50000 and monitor the different phases of the MDM Hub processes. If you observe frequent package overflows, tune the parameter again.
STMHEAP
AUTOMATIC
Configures the limit of the statement heap, which is used during the compilation of an SQL statement.
If the statement heap is not sufficient, it might prevent the optimizer from evaluating all possible access plans for an SQL query. This might result in a suboptimal plan and adversely influence performance. Set the
STMTHEAP
parameter to automatic to allow the optimizer to weigh all possible access plans for the compilation an SQL query.
SORTHEAP
AUTOMATIC
Configures the sort heap size.
The MDM Hub processes perform many sorts. If the sort heap size is not sufficient, large sorts can spill from the memory to disk. Disk input/output is slower compared to memory, and such sort spills can cause queries to run longer. Sort spills to disk can adversely influence performance and is more evident with larger spills. Set the parameter value to AUTOMATIC for the memory tuner to dynamically size the memory area as the sort requirements change.
SHEAPTHRES_SHR
AUTOMATIC
Configures the limit on the total amount of database shared memory that the sort memory consumers can use at a time. Set the
SHEAPTHRES_SHR
parameter to AUTOMATIC if you set the
SORTHEAP
parameter to AUTOMATIC.
UTIL_HEAP_SZ
50000 or higher
Configures the maximum amount of memory that the BACKUP, RESTORE, and LOAD utilities can use simultaneously. During some batch operations, the MDM Hub uses the IBM DB2 LOAD utility to move data between tables. The LOAD utility uses the utility heap to complete the data movement process. The size of the utility heap has an impact on the performance of the LOAD operation. Set the
UTIL_HEAP_SZ
parameter to an appropriate value to provide better throughput for the MDM Hub processes.

0 COMMENTS

We’d like to hear from you!