Read this section to learn what's new for profiling in Enterprise Data Catalog.
Profile Avro files
You can extract metadata, discover Avro partitions, and run profiles on Avro files that have a multilevel hierarchy. You can run profiles on Avro files for the following resource types:
HDFS
Microsoft Azure Data Lake Storage Gen2
Amazon S3
For these resources, you can run profiles on Avro files on the Spark engine.
For more information, see the Informatica Enterprise Data Catalog Administrator Guide.
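As a rough illustration of what a multilevel hierarchy in an Avro file looks like, the following sketch walks a nested Avro record schema and lists the leaf fields as dotted paths. The schema, the field names, and the `leaf_paths` helper are hypothetical and are not part of Enterprise Data Catalog; the sketch only shows the kind of nesting involved, not how the product extracts metadata.

```python
import json

# Hypothetical Avro schema with three levels of nesting (illustrative only).
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "contact", "type": {
      "type": "record",
      "name": "Contact",
      "fields": [
        {"name": "email", "type": "string"},
        {"name": "address", "type": {
          "type": "record",
          "name": "Address",
          "fields": [
            {"name": "city", "type": "string"},
            {"name": "postal_code", "type": ["null", "string"]}
          ]
        }}
      ]
    }}
  ]
}
"""


def leaf_paths(schema, prefix=""):
    """Return dotted paths for every non-record field of a record schema.

    Simplification: records nested inside unions, arrays, or maps are
    treated as leaves and are not descended into.
    """
    paths = []
    for field in schema["fields"]:
        path = prefix + field["name"]
        ftype = field["type"]
        if isinstance(ftype, dict) and ftype.get("type") == "record":
            # Nested record: descend one level deeper.
            paths.extend(leaf_paths(ftype, path + "."))
        else:
            paths.append(path)
    return paths


schema = json.loads(SCHEMA_JSON)
for path in leaf_paths(schema):
    print(path)
```

In this hypothetical schema, `contact.address.city` sits three levels deep, which is the sort of structure the term multilevel hierarchy refers to.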
Profile Google BigQuery resource with reserved keywords
You can run profiles on a Google BigQuery resource even when the Google BigQuery table contains reserved keywords.
For more information, see the Informatica Enterprise Data Catalog Scanner Configuration Guide.
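For context on why reserved keywords need special handling, Google BigQuery requires an identifier that matches a reserved keyword to be enclosed in backticks. The sketch below assumes the reserved keywords appear as column names and shows how a hand-written query would quote them. The `quote_identifier` and `build_select` helpers, the table name, and the column names are hypothetical, and the keyword set is a small partial subset; this is not how Enterprise Data Catalog builds its queries.

```python
import re

# Partial, illustrative subset of Google BigQuery reserved keywords.
RESERVED_KEYWORDS = {
    "all", "as", "by", "from", "group", "join", "limit", "order", "select", "where",
}

# Identifiers that need no quoting: a letter or underscore, then letters, digits, underscores.
UNQUOTED_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_identifier(name):
    """Wrap a column name in backticks when it is reserved or not a plain identifier."""
    if "`" in name or "\\" in name:
        # Escaping rules are out of scope for this sketch.
        raise ValueError(f"unsupported character in identifier: {name!r}")
    if name.lower() in RESERVED_KEYWORDS or not UNQUOTED_IDENTIFIER.match(name):
        return f"`{name}`"
    return name


def build_select(table, columns):
    """Build a SELECT statement with safely quoted column names."""
    column_list = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {column_list} FROM `{table}`"


# Hypothetical table with two columns named after reserved keywords.
print(build_select("my_dataset.orders", ["order", "customer_id", "group"]))
```

Without the backticks, a query that selects a column named `order` or `group` fails to parse.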
Use Amazon S3 and AWS Databricks Delta tables to run profiles
You can use Amazon S3 and AWS Databricks Delta tables to run column profiles and discover data domains on both native and AWS Databricks run-time environments.
External PostgreSQL database to improve performance of Similarity Discovery resources
You can configure an external PostgreSQL database to improve the performance of Informatica Similarity Discovery resources. You can either install the PostgreSQL database server that is bundled with the Enterprise Data Catalog installer, or configure an external PostgreSQL database after you install Enterprise Data Catalog.
For more information, see the KB article "How To: Configure an external PostgreSQL database for similarity profiling".