Best practices for using cloud storage to archive data in Data Archive

AWS S3

When you archive data to AWS S3 cloud storage, employ the following best practices:
  • Informatica does not ship the cURL library files with Data Archive. To use the secure HTTPS protocol, download a version of the cURL library that supports SSL.
    Download the following library versions:

      openssl-1.1.1k

      libssh2-1.9.0

      curl-7.74.0

    See the Knowledge Base article for more information. To verify that the installed libcurl build reports SSL support, see the first sketch after this list.
  • To minimize network latency, use storage buckets that are created in the same region as the client; to check this, see the region-check sketch after this list. Data Vault performs optimally only when the network latency between it and the external storage is low, so if you experience communication errors, verify the latency first.
  • On RHEL machines, clear the cache memory periodically and when you perform operations such as running a report or browsing data. For one way to clear the cache, see the cache-clearing sketch after this list.
  • Increase the MEMORY and MAXVMEM values in the ssa.ini file depending on the size of the SCT file on which you run the query. For more information, see the Knowledge Base article.
  • Increase the ulimit values on RHEL and SUSE Linux Enterprise machines based on the workload, for example, the number of jobs that run in parallel or the number of open files. To inspect or raise the current limits, see the ulimit sketch after this list.
  • When you retire an application with a large volume of data that includes LOB columns, run the extractor, loader, integrated validation, and retention policy steps in one job on LOCAL/SAN/NAS/DAS/network drive storage. When the job completes successfully, use the ssamigrate utility to migrate the archived data from the LOCAL/SAN/NAS/DAS/network drive storage to the cloud storage. See the Data Vault Administrator Guide for information on how to migrate archived data.
  • You can exclude LOB columns when you run a report. In JReports, BLOB data appears blank and CLOB data is visible only up to a certain limit.
  • To avoid limitations on external storage, ensure that the size of each SCT file does not exceed 5 GB. To scan for oversized files, see the size-check sketch after this list.
  • When you run a report query or browse data for tables that contain large volumes of data or LOB columns, ensure that no other operations run in parallel.
  • For information about why a job fails or stops, check the fas logs and the agent logs for curl error codes in the following locations:
    <Data Vault installation directory>\fas_logs
    <Data Vault installation directory>\fas_agent_logs
    See the curl error code documentation to understand why the errors occur and how to fix them. To search these logs for curl error codes, see the log-scan sketch after this list.
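
The following sketches are illustrations only, not part of the product. The first one checks whether the installed libcurl build reports SSL support by calling the libcurl curl_version() function through ctypes. The shared-object name libcurl.so.4 is an assumption; adjust it for your distribution.

    import ctypes

    # The shared-object name is an assumption; it varies by distribution.
    libcurl = ctypes.CDLL("libcurl.so.4")
    libcurl.curl_version.restype = ctypes.c_char_p

    # curl_version() returns a banner such as
    # "libcurl/7.74.0 OpenSSL/1.1.1k zlib/1.2.11 libssh2/1.9.0".
    banner = libcurl.curl_version().decode("ascii")
    print(banner)

    if "SSL" not in banner:
        raise SystemExit("This libcurl build does not report SSL support; "
                         "install a build linked against OpenSSL.")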
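
The region-check sketch below compares a bucket's region with the client's configured region using boto3. The bucket name is hypothetical; substitute your own.

    import boto3

    BUCKET = "my-data-vault-archive"  # hypothetical bucket name

    s3 = boto3.client("s3")
    # get_bucket_location returns None for buckets in us-east-1.
    bucket_region = (s3.get_bucket_location(Bucket=BUCKET)["LocationConstraint"]
                     or "us-east-1")
    client_region = boto3.session.Session().region_name

    print(f"bucket region: {bucket_region}, client region: {client_region}")
    if bucket_region != client_region:
        print("Warning: the bucket and the client are in different regions; "
              "expect higher network latency.")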
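
The cache-clearing sketch below drops the Linux page cache, dentries, and inodes through the kernel's /proc/sys/vm/drop_caches interface. It must run as root.

    import subprocess

    # Flush dirty pages to disk before dropping the caches.
    subprocess.run(["sync"], check=True)

    # Writing "3" drops the page cache, dentries, and inodes. Requires root.
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")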
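
The ulimit sketch below reads the current limits for open files and user processes with Python's standard resource module, then raises the soft open-files limit to the hard limit for the current process.

    import resource

    # Report the soft and hard limits for open files (ulimit -n)
    # and user processes (ulimit -u).
    for name, limit in (("open files", resource.RLIMIT_NOFILE),
                        ("processes", resource.RLIMIT_NPROC)):
        soft, hard = resource.getrlimit(limit)
        print(f"{name}: soft={soft} hard={hard}")

    # Raise the soft open-files limit to the hard limit for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))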
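
The size-check sketch below walks a directory tree and flags SCT files larger than 5 GB. The archive path is hypothetical; point it at your Data Vault data location.

    from pathlib import Path

    ARCHIVE_DIR = Path("/opt/datavault/archive")  # hypothetical path
    LIMIT = 5 * 1024**3  # 5 GB

    for sct in ARCHIVE_DIR.rglob("*.sct"):
        size = sct.stat().st_size
        if size > LIMIT:
            print(f"{sct}: {size / 1024**3:.1f} GB exceeds the 5 GB limit")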
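
The log-scan sketch below searches the fas and agent log directories for lines that mention curl error codes. The installation path and the message pattern are assumptions; the actual log format may differ.

    import re
    from pathlib import Path

    INSTALL_DIR = Path("/opt/datavault")  # hypothetical installation path

    # Assumed pattern: a line that mentions "curl" together with an error code.
    pattern = re.compile(r"curl.{0,40}(?:error|code)\D{0,5}(\d+)", re.IGNORECASE)

    for log_dir in ("fas_logs", "fas_agent_logs"):
        for log in sorted((INSTALL_DIR / log_dir).glob("*.log")):
            text = log.read_text(errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                match = pattern.search(line)
                if match:
                    print(f"{log}:{lineno}: curl error {match.group(1)}")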
