Table of Contents

  1. Preface
  2. Starting Data Archive
  3. System Configuration
  4. Database Users and Privileges
  5. Source Connections
  6. Target Connections
  7. Archive Store Configuration
  8. Datatype Mapping
  9. Database Optimization
  10. SAP Application Retirement
  11. z/OS Source Data Retirement
  12. Seamless Data Access
  13. Data Discovery Portal
  14. Security
  15. SSL Communication with Data Vault
  16. LDAP User Authentication
  17. Auditing
  18. Running Jobs from External Applications
  19. Salesforce Archiving Administrator Tasks
  20. Upgrading Oracle History Data
  21. Upgrading PeopleSoft History Data
  22. Data Archive Maintenance
  23. Appendix A: Datetime and Numeric Formatting
  24. Appendix B: Data Archive Connectivity

Administrator Guide

Teradata Source Connections

Define archive source connection properties to connect to applications on Teradata databases.
You can configure the following source connection properties:
Host
IP address of the source application database server.
Database Name
Unique identifier or system identifier for the source application database server.
Admin Schema Name
Default administration database user for the source database server, such as SYSTEM.
The administration database user has DBA rights to the database, including the ability to run DDL and access system-level objects.
Admin Login Name
Login name for the administration database user. This user does not require special permissions as it is only used to connect to the source database. You can provide any user name, such as a database connection user or a read-only user.
Password
Password for the administration login name.
Apps Schema Name
Database user that owns the application tables that you want to archive, such as APPS for Oracle applications.
The production application user is either the owner of the tables that you want to archive or partition, or the user that has full rights to the tables that you want to archive or partition. During the archive process, you can use the production application user to delete from the source. Alternatively, you can configure the source connection to use the staging user to delete data from the source.
Use the default database user for the application, or provide another value if you do not use the default user. If you provide a user that does not own or have full access to the application tables, the archive job fails because it cannot find the table or view.
The archive job uses the number of schemas in the source application to determine schema names. The archive job uses this attribute value for applications that have one schema, such as PeopleSoft and Siebel. The archive job uses the schema name from the mined application for applications that have multiple schemas, such as Oracle applications.
The ILM engine uses this field to generate the SQL SELECT statement to read data from the application tables. For example, if you enter SYSADMIN, the archive job generates the following SQL:
SELECT * from SYSADMIN.<OBJECT NAME>
Application Login Name
Login name that connects to the source database that contains the data you want to archive. This user does not require special permissions as it is only used for the database connection. You can provide any user name, such as a database connection user or a read-only user.
Password
Password for the application login name.
Staging Schema Name
Staging database user for the source database.
The staging database user stores interim tables for business rule validation and generates candidates during the archive and partitioning processes. If you enable staging for the archive source connection, the staging user also creates staging tables. The staging tables temporarily store a copy of the archived data before the archive job moves the data to the archive or history database. The staging user typically resides in the source database.
Staging Login Name
Login name for the staging database user.
Password
Password for the staging login name.
Use Staging
Determines whether the archive job temporarily stores data in a staging area on the archive source before it moves data to the target. The archive job includes heavy select and delete operations on the tables that are selected in the archive job. The operations may impact performance on any user operations on the tables. For example, if you run an archive job that archives purchase orders, users may see a performance impact when they create a purchase order. The main benefit of staging is to reduce the number of operations on the production tables.
If enabled, the archive job copies data to the staging area after the archive job evaluates the business rules. The archive job copies data that meets the business rules from the archive source to the staging area. Then, the job deletes data from the source and moves data from the staging area to the target system.
If disabled, the job extracts data directly from the archive source to the archive target. The extraction and deletion occur simultaneously. You may want to disable staging if you do not want to create additional storage space on the production database.
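The order of operations in the two modes can be sketched as follows. This is an illustrative simulation, not the product's implementation: the in-memory lists stand in for the source, staging, and target tables, and `meets_rules` stands in for the business rule evaluation.

```python
def archive_with_staging(source, staging, target, meets_rules):
    """Staged mode: copy qualifying rows to the staging area, delete
    them from the source, then move staging data to the target."""
    qualifying = [row for row in source if meets_rules(row)]
    staging.extend(qualifying)        # copy qualifying rows to staging
    for row in qualifying:
        source.remove(row)            # delete from the source
    target.extend(staging)            # move staging data to the target
    staging.clear()

def archive_direct(source, target, meets_rules):
    """Direct mode: extraction and deletion occur together, with no
    staging storage on the production database."""
    for row in list(source):
        if meets_rules(row):
            target.append(row)
            source.remove(row)
```

In staged mode the heavy delete work happens against copies already secured in staging, which is why the production tables see fewer operations.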
JDBC Fetch Size
Controls the number of rows that the system fetches in one batch from one table if you use JDBC to connect to the source database.
Only change the default value if you receive an out-of-memory error when the archive job copies data to the destination. You may get the out-of-memory error if the source table has large objects in each row, such as CLOBS or BLOBS data types. If you receive this error, then reduce the fetch size.
Default is 1000.
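The effect of the fetch size on memory can be pictured as a batched cursor: at most one batch of rows is buffered at a time, so tables whose rows carry large CLOB or BLOB values need a smaller batch to fit in memory. In this sketch a plain Python iterator stands in for the JDBC result set.

```python
def fetch_in_batches(cursor, fetch_size=1000):
    """Yield rows in batches of at most fetch_size, mimicking how a
    JDBC fetch size bounds the rows buffered per round trip."""
    batch = []
    for row in cursor:
        batch.append(row)
        if len(batch) == fetch_size:
            yield batch
            batch = []
    if batch:
        yield batch   # final partial batch
```

Reducing `fetch_size` trades more round trips for a smaller peak memory footprint, which is the remedy the documentation suggests for out-of-memory errors on LOB-heavy tables.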
Maximum Parallel Sessions
Determines the maximum number of parallel session requests that jobs can pass to Teradata Parallel Transporter or Teradata JDBC FastExport when you export data from Teradata sources. Use the average number of Teradata Access Module Processors (AMPs) to determine the maximum parallel sessions to configure.
Default is 1.
Parallel Entity Run
Determines the interim table names in the entity if you want to run the entity in parallel archive jobs. For example, you may extract data in parallel from two different organizations in the same human resource entity. The system creates interim tables in the staging database user. However, the tables need to be differentiated so the archive jobs know which interim tables are relevant for that job.
By default, the system creates interim table names with the same name in EDM. If you enable this attribute, then the system creates the interim table names with the following syntax:
<table name>_<job ID>
If you enable this attribute and the staging database user does not have authorization to create tables, then the archive job fails.
Default is enabled.
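The naming rule above is simple enough to state as a one-line function. The table name in the test is a hypothetical example, not an actual interim table.

```python
def interim_table_name(table_name, job_id, parallel_entity_run=True):
    """Build the interim table name per the documented syntax:
    the job ID suffix is added only when Parallel Entity Run
    is enabled; otherwise the EDM name is used unchanged."""
    if parallel_entity_run:
        return f"{table_name}_{job_id}"
    return table_name
```

The job-ID suffix is what lets two parallel archive jobs over the same entity keep their interim tables apart.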
Use Teradata Parallel Transporter
Determines the Teradata native utility that the archive or retirement job uses to export data from Teradata to the Data Vault.
If enabled, the job uses Teradata Parallel Transporter to export data. Teradata Parallel Transporter cannot process tables that include binary and large object data types, such as BLOB and CLOB. The job uses Teradata JDBC FastExport to process tables with binary and large object data types. If the length of all column data types exceeds 8000 characters, the job uses JDBC FastExport.
If disabled, the job uses Teradata JDBC FastExport to export data from all tables. Disable for source connections in restore jobs. Teradata Parallel Transporter is not available for restore.
Teradata Parallel Transporter is not supported in non-bulk mode.
Default is disabled.
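The selection rules above can be summarized as a predicate. The BLOB/CLOB type set and the 8000-character threshold follow the description in this section; the function itself is an illustrative sketch, not product code.

```python
def uses_fastexport(column_types, total_column_length, tpt_enabled=True):
    """Return True if the job falls back to Teradata JDBC FastExport
    for a table, per the rules described above."""
    if not tpt_enabled:
        return True                      # TPT disabled: always FastExport
    lob_types = {"BLOB", "CLOB"}         # binary/large object types (illustrative set)
    if any(t in lob_types for t in column_types):
        return True                      # TPT cannot process LOB columns
    if total_column_length > 8000:
        return True                      # combined column length exceeds 8000
    return False                         # otherwise TPT exports the table
```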
Template Name (Teradata Parallel Transporter)
Name of the script template that the archive or retirement job uses to create script files when the job uses Teradata Parallel Transporter for data export. The name is the API package name in the Enterprise Data Manager.
The job uses the script template to create one script for each table that Teradata Parallel Transporter exports data from. During script creation, the job replaces the template parameters with table-specific values. The job uses the script to pass the query request to Teradata Parallel Transporter, which rewrites the query request to optimize it for maximum performance and throughput.
Edit only if you customized a script template. The default script template is read-only. If you customize the script template, you must copy the script template and modify the copied version. Enter the name of the customized script template.
Default is TPT_EXTRACT_SCRIPT_TEMPLATE.
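The template mechanism — one shared script with parameters replaced per table — can be sketched with a simple string template. The placeholder names and the script text here are hypothetical; the real parameters are those defined in TPT_EXTRACT_SCRIPT_TEMPLATE.

```python
from string import Template

# Hypothetical placeholders standing in for the template parameters
# that the job replaces with table-specific values.
script_template = Template("EXPORT SELECT * FROM $schema.$table;")

def render_script(schema, table):
    """Create one table-specific script from the shared template."""
    return script_template.substitute(schema=schema, table=table)
```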
Compress File (Teradata Parallel Transporter)
Determines if Teradata Parallel Transporter compresses the BCP files that the utility creates.
If enabled, Teradata Parallel Transporter creates compressed, zipped BCP files. You may want to enable the compression if you have a limited amount of staging space for the BCP files. If you enable BCP file compression, the performance of the archive or retirement job may be impacted. Teradata Parallel Transporter creates one BCP file for each table. The time it takes to create the BCP file depends on the size of the data in the table. The compression time increases as the table size increases.
If disabled, Teradata Parallel Transporter creates uncompressed BCP files. Disable to optimize the performance of the archive or retirement job. You may want to disable the compression if you do not have a space limitation for the BCP file staging area.
Default is disabled.
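The space-versus-CPU trade-off can be illustrated with a minimal sketch that builds the contents of one BCP file, optionally compressed. The pipe-delimited row format is illustrative only; it is not the product's actual BCP layout.

```python
import gzip

def make_bcp(rows, compress=False):
    """Build the contents of one BCP file for a table. Compression
    shrinks the staging footprint at the cost of CPU time, which
    grows with the size of the table data."""
    data = "\n".join(rows).encode("utf-8")
    return gzip.compress(data) if compress else data
```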
Use Imported Schema Name
The name of the schema when you imported the table metadata from the source.
You must select Use Imported Schema Name when the tables in the entity are from more than one schema.
User-defined data types are not supported for Teradata source connections. The PERIOD (TIME WITH TIMEZONE) data type is not supported in the restore functionality.
