Data Profiling


Troubleshooting a data profiling task

Create and run profiles

The Review Insights option in the menu appears disabled if I open the Results tab before the Insights job is completed. How can I resolve this issue?: To resolve this issue, refresh the page. The Review Insights option appears enabled in the menu if insights are generated for the profile.

During profile creation, if I choose an ODBC connection and search for a source object, the search results do not show the source object even when it exists. How can I resolve this issue?: Searches are case-sensitive for ODBC. To search for the source object, enter the source object name using the correct case.
A profile run fails and the following error message appears: Error occurred when initialising tenant - Failed to create a tenant even after 5 attempts.: To resolve this issue, restart the profiling svc nodes and re-run the profile.
A profile run fails and the following error message appears in the session log: "The executor with id xx exited with exit code 137(SIGKILL, possible container OOM).". How do I resolve this issue?: To resolve this issue, perform the following steps:
Open the custom.properties file available in the following location on the machine where the Secure Agent runs:
/root/infaagent/apps/At_Scale_Server/<version>/spark/

Add the following property: spark.executor.memoryOverhead = 2048MB

Save the custom.properties file.
In Data Profiling, run the profile.
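The file edit in the steps above can also be scripted. The following is a minimal sketch, not an Informatica utility: the helper name set_property is hypothetical, and it assumes that custom.properties holds simple key = value lines. It updates the property if it is already present so that the file does not end up with duplicate entries.

```python
from pathlib import Path


def set_property(path, key, value):
    """Add 'key = value' to a properties file, or update the line if the key exists."""
    file = Path(path)
    lines = file.read_text().splitlines() if file.exists() else []
    updated = False
    for i, line in enumerate(lines):
        # Skip comment lines; compare the text before the first "=" with the key.
        if not line.lstrip().startswith("#") and line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key} = {value}"
            updated = True
    if not updated:
        lines.append(f"{key} = {value}")
    file.write_text("\n".join(lines) + "\n")
```

To apply it, call set_property with the full path to custom.properties (with <version> replaced by the actual directory name), the key spark.executor.memoryOverhead, and the value 2048MB.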
A profile run fails and the following error message appears in the session log: "The node was low on resource: ephemeral-storage. Container spark-kubernetes-driver was using xxx, which exceeds its request of xx." How do I resolve this issue?: To resolve this issue, increase the minimum and maximum EBS volume sizes to attach to a worker node for temporary storage during data processing. To increase the minimum and maximum EBS volume sizes, perform the following steps in Administrator:
In Administrator, open the Advanced Clusters page.
Select the Advanced Configuration for which you want to change the EBS volume size.
Click Edit.
In the EBS Volume Size field of the Platform Configuration area, increase the values in the Min GB and the Max GB fields to 200.
By default, the minimum and maximum volume sizes are 100 GB.

Click Save.
Restart the Secure Agent.
In Data Profiling, run the profile.
A profile run fails with an internal error when the source object contains a column name that is longer than 73 characters.: To resolve this issue, reduce the length of the column name.
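To find the affected columns before you run the profile, you can check the column names against the limit. The following is a minimal sketch. The function name and the sample column names are hypothetical, and the limit of 73 characters is taken from the answer above.

```python
MAX_COLUMN_NAME_LENGTH = 73  # limit stated in the answer above


def long_column_names(column_names, limit=MAX_COLUMN_NAME_LENGTH):
    """Return the column names that are longer than the limit."""
    return [name for name in column_names if len(name) > limit]


columns = ["customer_id", "x" * 74]  # hypothetical sample names
for name in long_column_names(columns):
    print(f"{len(name)} characters: {name}")
```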
Unable to save a profile using Databricks with an ODBC connection when I create tables with the same name under two different databases. How can I resolve this issue?: This issue occurs when you do not specify the schema name in the connection. To resolve this issue, specify the schema name in the connection to point to the correct database.
If columns contain a large number of rows, the profile job fails for a Microsoft Azure Synapse SQL connection and the following error message appears: "[FATAL] Exception: com.microsoft.sqlserver.jdbc.SQLServerException: Error 0x27 - Could not allocate tempdb space while transferring data from one distribution to another." How can I resolve this issue?: To resolve this issue, increase the Data Warehouse Units (DWU) of the Microsoft Azure Synapse SQL instance.
A profile run fails with the error "Profile job failed with error java.lang.RuntimeException: Output Port Primary does not exist in specified rule". How do I resolve this issue?: This error appears when the following conditions are true:
In Data Profiling, you create a profile, add a rule R1, save, and run the profile.
In Data Quality, you modify the rule input or output name for rule specification R1 and save it.
In Data Profiling, you run the profile.
To resolve this issue, remove rule R1 from the profile and save the profile. Then add rule R1 to the profile again, save, and run the profile.
A profile run fails with the error "***ERROR: nsort_release_recs() returns -10 ". How do I resolve this issue?: To resolve this issue, increase the available disk space on the drive where the Secure Agent is installed.
When you run a profile on an Amazon S3 source object, the profile run fails with an error "Cloud DQ Profiling failure ERROR: Unexpected condition at file: [..\..\..\common\reposit\trepcnx.cpp] line: [293]". How do I resolve this issue?: To resolve this issue, ensure that you have a valid license for the Amazon S3 connection in Administrator.

When I run a profile on a Salesforce source object, the profile run fails and an 'Out of Memory' error appears. How do I resolve this issue?: To resolve this issue, you can increase the Java heap size -Xmx value to twice its current value. To increase the Java heap size, perform the following steps in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to change the Java heap size.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit in the row for the INFA_MEMORY property.
Increase the value of Xmx to twice its current value.
For example, if the current value of the INFA_MEMORY property is -Xms256m -Xmx512m, change it to -Xms256m -Xmx1024m.

Click Save.
Restart the Secure Agent.
In Data Profiling, run the profile.
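The doubling rule in the steps above can be expressed as a small helper that computes the new INFA_MEMORY value to enter in Administrator. The following is a minimal sketch. The function name double_xmx is hypothetical, and it assumes that the -Xmx value carries a unit suffix such as m or g.

```python
import re


def double_xmx(infa_memory):
    """Return the INFA_MEMORY string with the -Xmx value doubled."""
    def repl(match):
        size, unit = int(match.group(1)), match.group(2)
        return f"-Xmx{size * 2}{unit}"

    # Matches values such as -Xmx512m or -Xmx2g; -Xms is left unchanged.
    return re.sub(r"-Xmx(\d+)([kKmMgG])", repl, infa_memory)


print(double_xmx("-Xms256m -Xmx512m"))  # -Xms256m -Xmx1024m
```

The helper only produces the string. You still set the value in the INFA_MEMORY row and restart the Secure Agent as described in the steps.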
The profile run fails with an "Out Of Memory" error. How do I resolve this issue?: To resolve this issue, you can increase the Java heap size -Xmx value to twice its current value. To increase the Java heap size, perform the following steps in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to change the Java heap size.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit in the row for the INFA_MEMORY property.
Increase the value of Xmx to twice its current value.
For example, if the current value of the INFA_MEMORY property is -Xms256m -Xmx512m, change it to -Xms256m -Xmx1024m.

Click Save.
Restart the Secure Agent.
In Data Profiling, run the profile.
When I run a profile on a Google BigQuery source object, the profile run fails and a 'GC overhead limit exceeded' error appears. How do I resolve this issue?: To resolve this issue, you can increase the Java heap size in the JVM options for type DTM. To increase the Java heap size, perform the following steps in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to change the Java heap size.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit in the row for the INFA_MEMORY property.
Set the available JVMOption fields to a minimum (-Xms1024m) and maximum (-Xmx4096m) Java heap size. For example, set JVMOption3 to -Xms1024m and JVMOption4 to -Xmx4096m.
Click Save.
Restart the Secure Agent.
In Data Profiling, run the profile.
When I run a profile on a Snowflake Data Cloud source object, the profile job runs with a warning or it fails.: To resolve this issue, you must increase the Java heap size in the JVM options. To increase the Java heap size, perform the following steps in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to change the Java heap size.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Set the available JVM Option fields to a maximum Java heap size value.
If the profile job runs with a warning due to large volumes of data in the source object, set the available JVM Option fields to a maximum Java heap size as per your requirements, for example -Xmx2048m.
If the profile job fails, set the available JVM Option fields to a maximum (-Xmx2048m) Java heap size.
For more information, see the following Knowledge Base article.
Click Save.
Wait until the Data Integration Server service restarts.
In Data Profiling, run the profile.
Data Profiling rejects the rows that have conversion errors when you run a profile. How do I resolve this issue?: This issue occurs when you edit the column metadata to change the data type of a column that still includes rows with a few values of the previous data type. For example, the data source includes a column with string and integer values and you change the column data type to integer.

To resolve this issue, you can configure the Stop on Errors option and enter the number of rows that include the incorrect data type, and then run the profile.
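To estimate the number to enter for the Stop on Errors option, you can count the values in a sample of the column that do not convert to the new data type. The following is a minimal sketch for the string-to-integer example above. The function name and the sample values are hypothetical, and the conversion rules of Python's int() are only an approximation of how Data Profiling converts values.

```python
def count_conversion_errors(values):
    """Count values that cannot be converted to an integer."""
    errors = 0
    for value in values:
        try:
            int(str(value).strip())
        except ValueError:
            errors += 1
    return errors


sample = ["10", "25", "abc", "7", "n/a"]  # hypothetical column values
print(count_conversion_errors(sample))  # 2
```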
How do I run a profile with Avro and Parquet file format types?: To run a profile with Avro or Parquet file format type, you need to configure the Amazon S3 V2 or Azure Data Lake Store connection with the respective secure agents for the Amazon or Azure cluster.
When I run a profile with Avro or Parquet file format types, the profile run fails and the following error message appears: Columns tab error:[The file or partition directory[] is not valid. The parser encountered the following error while parsing the content:[Only one hadoop distribution can be supported]. Select a valid [Parquet] file or partition directory.] . How do I resolve this issue?: The Cloudera 6.1 package that contains the Informatica Hadoop distribution script and the Informatica Hadoop distribution property files is part of the Secure Agent installation. When you run the Hadoop distribution script, you need to specify the distribution that you want to use. To resolve the above issue, you need to perform the following steps:
Go to the following Secure Agent installation directory where the Informatica Hadoop distribution script is located:
<Secure Agent installation directory>/downloads/package-Cloudera_6_1/package/Scripts

Copy the Scripts folder outside the Secure Agent installation directory.
From the terminal, run the ./infadistro.sh command from the Scripts folder and proceed with the prompts.
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to configure the DTM property and click Edit.
Add the following DTM properties in the Custom Configuration section:
Service: Data Integration Service
Type: DTM
Name: INFA_HADOOP_DISTRO_NAME
Value: <distribution_version>
The value of the distribution version can be given as CDH_6.1.

Restart the Secure Agent to reflect the changes.
In Data Profiling, run the profile.
For more information on the above steps, see Configure Hive Connector to download the distribution-specific Hive libraries in Data Integration Connectors help.
When a profile run fails with the following error: "Either the Amazon S3 bucket <xyz> does not exist or the user does not have permission to access the bucket", the Amazon S3 test connection also fails for the same runtime environment.: To resolve the issue, perform the steps listed in the following Knowledge Base article.
When you use the Snowflake ODBC connection to create a profile, the source columns do not load in Data Profiling and the following error message appears: {"@type":"error","code":"APP_13400","description":"com.informatica.saas.rest.client.spring.RestTemplateExtended$SpringIOException: HTTP POST request failed due to IO error: Read timed out; nested exception is org.springframework.web.client.ResourceAccessException: I/O error on POST request for \"https://iics-qa-release-pod2-r36-r1-cdi102.infacloudops.net:47813/rest/MetadataRead/getTableMetadata\": Read timed out; nested exception is java.net.SocketTimeoutException: Read timed out","statusCode":403}
To resolve this issue, you must add the CLIENT_METADATA_REQUEST_USE_CONNECTION_CTX=true property in the odbc.ini file located at the $ODBCHOME directory.
Snowflake profiles on large volumes of data, such as 10 million rows or more, fail with the following error: "The target server failed to respond". How do I resolve this issue?: To resolve this issue, perform the following steps:
Create a file named logging.properties at any location on the Secure Agent machine, add the following line to the file, and save the file:

java.util.logging.ConsoleHandler.level=WARNING

In Administrator, open the Runtime Environments page.
Select the Secure Agent and click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit Agent Configuration and add the following value for an empty JVMOption property:
-Xmx6144m

If the Java heap size -Xmx value is already configured, edit the value of the existing JVMOption property to -Xmx6144m.

Click Edit Agent Configuration and add the following value for an empty JVMOption property:
-Dnet.snowflake.jdbc.loggerImpl=net.snowflake.client.log.JDK14Logger

Click Edit Agent Configuration and add the following value for an empty JVMOption property:
-Djava.util.logging.config.file=<absolute path along with file name created in step 1>

Click Save.
Restart the Secure Agent.
In Data Profiling, run the profile.
A profile run fails for a Snowflake or Azure Synapse SQL connection and the following error message appears: 'com.informatica.profiling.jpa.model.ProfileableDataSourceColumn; nested exception is org.hibernate.HibernateException: More than one row with the given identifier was found'. How do I resolve this issue?: This issue occurs if the following conditions are true:
You do not specify a schema during the ODBC connection configuration for a Snowflake or Azure Synapse SQL subtype.
Multiple tables with the same name and columns exist in different schemas of the connection.

To resolve this issue, you must add a schema in the connection properties to eliminate the duplicate source objects.
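To check whether the second condition applies, you can look for table names that occur in more than one schema. The following is a minimal sketch. The function name and the sample (schema, table) pairs are hypothetical, and in practice the pairs would come from a catalog query against the database.

```python
from collections import defaultdict


def tables_in_multiple_schemas(tables):
    """Map each table name to its schemas when it exists in more than one schema."""
    schemas_by_table = defaultdict(set)
    for schema, table in tables:
        schemas_by_table[table].add(schema)
    return {t: sorted(s) for t, s in schemas_by_table.items() if len(s) > 1}


# Hypothetical (schema, table) pairs, for example from a catalog query.
catalog = [("SALES", "ORDERS"), ("FINANCE", "ORDERS"), ("SALES", "CUSTOMERS")]
print(tables_in_multiple_schemas(catalog))  # {'ORDERS': ['FINANCE', 'SALES']}
```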
A few profile runs fail with the following service exceptions: com.informatica.cloud.errorutil.MicroServiceException: Error parsing results file. com.opencsv.exceptions.CsvMalformedLineException: Unterminated quoted field at end of CSV line and java.sql.SQLException: Parameter index out of range (7 > number of parameters, which is 6). How do I resolve the issues?: To resolve the issues, you must set the following flag in the Custom Configuration section of the Secure Agent: ADD_ESCAPE_CHAR_TO_TARGET=true.
The following image displays the sample configuration details:
When I run a profile on a JSON source object, the profile run fails and the following error message appears: <WorkflowExecutorThread40> SEVERE: The Integration Service failed to execute the mapping. java.lang.RuntimeException: java.lang.RuntimeException: [SPARK_1003] Spark task [InfaSpark0] failed with the following error: [Container [spark-kubernetes-driver] failed with reason [Error] and message [ehaus.janino.CodeContext.flowAnalysis(CodeContext.java:600) ++ at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:600) How do I resolve this issue?: To resolve the issue, perform the following steps:
Stop the Secure Agent and the cluster that is associated with the Secure Agent.
Open the custom.properties file in the following Secure Agent directory:
<AgentHome>/apps/At_Scale_Server/<latestversion>/spark

Enter the following values:

spark.driver.extraJavaOptions=-Djava.security.egd=file:/dev/./urandom -XX:MaxMetaspaceSize=256M -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -Xss75m

spark.executor.extraJavaOptions=-Djava.security.egd=file:/dev/./urandom -XX:MaxMetaspaceSize=256M -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -Xss75m

spark.driver.memory=10G

spark.executor.memory=12G

Start the Secure Agent.
Re-run the profile.
When you run a profile that includes a mapplet with a Java transformation, the profile fails and the following error message appears: 400 : "{"code":"0","description":"Compilation failed for Java Tx: Java: 500 : \"{\"error\":{\"code\":\"APP_60001\",\"message\":\"Exception occurred during compilation: {\\\"code\\\":\\\"TUNNEL_NOT_FOUND\\\",\\\"message\\\":\\\"No tunnels discovered for... How do I resolve this issue?: Before you create a Mapplet with a Java transformation, perform the following steps:
In Administrator, navigate to the Runtime Environments page and select Enable or Disable Services, Connectors from the Actions menu of a Secure Agent or a Secure Agent group.
In the Enable/Disable Components in Agent Group window, select Data Integration - Elastic.
Click Save.

If the issue persists, perform the following steps:
In Data Integration, open the mapplet that contains the Java transformation.
Select the Java transformation in the Design workspace.
Compile and save the mapplet.
A profile run fails if the tenant initialization intermittently fails for a few orgs and the following error occurs: - java.lang.RuntimeException: java.security.InvalidKeyException: Invalid AES key length: 56 bytes. How do I resolve this issue?: You can re-run the profile if the first profile run in the org fails with the runtime exception error message.
If the runtime environment of the target connection is not up and running, the profile import job fails with an internal error.: Update the target connection details with a runtime environment that is up and running.
After you upgrade Data Profiling from version 2023.08.S to the current version, profiles that read data from SAP ERP and SAP HANA source objects fail with the following error message: com.informatica.imf.io.impl.XMLDeserializerImpl$DeserializeHandler error SEVERE: cvc-complex-type.3.2.2: Attribute 'segregationCategory' is not allowed to appear in element 'adapter:ConnectionAttribute'.
To resolve the issue, perform the following steps:
Open the following directory in the Secure Agent installation: <Secure Agent installation directory>/downloads/<SAP Connector Package>
Delete the previous SAP connector package from the downloads folder manually.
Re-run the profile.

Data types and patterns

For which data source does the Data Preview area show True or False for Boolean data type?: Data Profiling shows True and False for Salesforce columns that have the Boolean data type.
Does Data Profiling support all the data types in a Google BigQuery source object?: Data Profiling supports most of the data types in a Google BigQuery source object. The following table lists the known issues for Google BigQuery data types in Data Profiling:

Data types	Known issues
String	When the column precision exceeds 255, Data Profiling truncates the column precision to 255 before the profile run. An incorrect frequency of null values appears in the Details Data Types section. When you drill down on null values, blank values also appear in the Data Preview area.
Numeric	When the column precision exceeds 28, the profile run fails and the following error appears: [ERROR] Data Conversion Failed.
Time, Datetime, or Timestamp	Milliseconds do not appear in the profile results. Profile results contain duplicate values, which results in an incorrect frequency of values.
Geography	The profile run fails and the following error appears: [SDK_APP_COM_20000]
Float	When you drill down or create queries, an error appears if the column contains +inf, -inf, or NaN values.
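For the Float known issue, you can check a sample of the column for the values that trigger the error before you drill down or create queries. The following is a minimal sketch. The function name and the sample values are hypothetical.

```python
import math


def non_finite_values(values):
    """Return the values that are +inf, -inf, or NaN."""
    return [v for v in values if isinstance(v, float) and not math.isfinite(v)]


sample = [1.5, float("inf"), float("-inf"), float("nan"), 2.0]  # hypothetical column values
print(len(non_finite_values(sample)))  # 3
```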
Why do I see a pattern mismatch for INTERVALYEARTOMONTH and INTERVALDAYTOSECOND data types?: This issue occurs because Data Profiling reads the INTERVALYEARTOMONTH and INTERVALDAYTOSECOND data types as strings during pattern detection.
Binary float data types appear with extra decimal places. Do I need to do anything to round this off to two decimal places?: This is a known and accepted behavior for the binary float data type in Data Profiling. No action is required.

Profile results

If the drilldown results contain 100 or more rows, the Data Preview area does not display all the rows and the following error message appears in the session log: "Transformation Evaluation Error [<<Expression Fatal Error>> [ABORT]: DrillDown limit reached... i:ABORT(u:'DrillDown limit reached')]]". How do I resolve this issue?: If the drilldown results contain 100 or more rows, Data Profiling stops processing the job and displays the top 100 results in the Data Preview area. To view the drilldown results for all the rows, use the Queries option in the Data Preview area.

Incorrect profile results appear for data sources that contain UTF-8 characters. How do I resolve this issue?: If the data source contains UTF-8 characters, set the OdbcDataDirectNonWapi parameter to 0 in Administrator. Then, in Data Profiling, create and run the profile on the source object. To configure the property, perform the following steps in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to set this property.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit in the row for the OdbcDataDirectNonWapi property and set the property to 0.
Click Save.
Why do I sometimes see no drilldown results for numeric columns?: This issue can occur when the data type is Integer and the column precision is greater than 28. Data Profiling does not display drilldown results for Integer data types with column precision greater than 28.
Why do I sometimes see incorrect column statistics for numeric and decimal columns that include average, sum, standard deviation, and most frequent values?: This issue can occur when the column precision for numeric columns or decimal columns is greater than 28. Data Profiling does not support column precision greater than 28 for numeric columns and decimal columns.
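Column precision is part of the column metadata, so the reliable check is the declared precision in the source. When only sample values are available, counting the digits of each value gives a rough indication of whether a column is likely to exceed the limit of 28. The following is a minimal sketch under that assumption. The function names are hypothetical, and the digit count of a value is only an approximation of the declared column precision.

```python
from decimal import Decimal


def digit_count(value):
    """Return the number of digits in a decimal value, ignoring sign and decimal point."""
    return len(Decimal(value).as_tuple().digits)


def exceeds_precision(value, limit=28):
    """Return True if the value has more digits than the limit."""
    return digit_count(value) > limit


print(exceeds_precision("1234567890123456789012345678.9"))  # True
print(exceeds_precision("12345.67"))  # False
```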
After I upgrade to Spring 2020 July, I do not see the existing query results. Why?: This issue occurs because the previous query results location $PMCacheDir\profiling\query is no longer valid. To view the query results, run the query again after you select a flat file connection. Data Profiling saves the query results to a file in the directory that you specified for the flat file connection.
After I upgrade to Fall 2020 October, I can still view the drill down results of Spring 2020 July in the Secure Agent location. How do I clear the drill down results?: To clear the drill down results, open the following directory in the Secure Agent installation, and then delete the Spring 2020 July drill down results manually:
<Agent_installation_dir>/apps/Data_Integration_Server/data/temp/profiling/drilldown

I see incorrect profile results for columns that include escape characters. How do I resolve this issue?: To resolve this issue, you must set the following flag in the Custom Configuration section of the Secure Agent: ADD_ESCAPE_CHAR_TO_TARGET=true.
The following image displays the sample configuration details:
After I import a profile into a folder that contains a profile with the same name, I cannot view the connection and column details of the imported profile on the profile results page. How do I resolve this issue?: This issue occurs when you export a profile from project P1 and import it back into project P1. To resolve this issue, you must import the profile into a different folder. The profile results appear even if the folder contains a profile with the same name.
When I run a profile with a custom rule, I notice that Data Profiling fetches the expected results. For example, with a sampling of 1000 rows, I notice 998 valid and 2 invalid rows in the profiling results. However, when I drill down on the source object after applying a filter, I notice 998 valid rows and an incorrect value for invalid rows in the profiling results. How do I resolve this?: This is expected behavior. When you run a profile with the FIRST N ROWS sampling option to retrieve 10 rows, you can view 10 rows on the profile results page. However, when you drill down on the source object, Data Profiling retrieves 100 rows instead, ignoring the FIRST N ROWS sampling option.

Rules

Why does a profile run take a long time to complete when it contains a Verifier asset as a rule?: This issue occurs when the following conditions are true:
You add the Verifier asset as a rule to the profile and run the profile.
The Secure Agent is configured for a full country license.
The reference data directory in the Secure Agent does not contain address reference data.
When you add the Verifier asset as a rule and run the profile, the Secure Agent downloads the address reference data for the first time, which might increase the profile run time. The address reference data is the authoritative data for the postal addresses in the specified country.

Miscellaneous

How do I change the cache directory name in Administrator?: Perform the following steps to edit the cache directory name in Administrator:
In Administrator, open the Runtime Environments page.
Select the Secure Agent for which you want to change the cache directory name.
Click Edit.
In the System Configuration Details area, select the Data Integration Server service and choose the DTM type.
Click Edit in the row for the $PMCacheDir property.
Remove the whitespace in the property.
For example, if the property contains C:\Informatica Cloud Secure Agent\temp\cache, change it to C:\InformaticaCloudSecureAgent\temp\cache.

Click Save.
In Data Profiling, run the profile.
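The whitespace removal in the steps above can be sketched as a one-line transformation, which is useful when you need to prepare the value for several Secure Agents. The function name remove_whitespace is hypothetical.

```python
def remove_whitespace(path):
    """Return the path with all whitespace characters removed."""
    return "".join(path.split())


print(remove_whitespace(r"C:\Informatica Cloud Secure Agent\temp\cache"))
# C:\InformaticaCloudSecureAgent\temp\cache
```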
Why does profile import fail if a profile with the same name exists in the folder?: This issue occurs because Data Profiling does not support overwriting of assets during an import operation. To resolve this issue, rename the existing profile in the folder and then import the profile.
Why do I see the "ERROR: Document Artifact with Id \u003d jaAqeGnQc6phwrbCWkBW8D not found" error when I delete a profile run? How do I resolve it?: This issue occurs when a profile has an invalid frs ID association. To resolve this issue, you can re-import the profile asset if you have the export file. Or, you can move or copy the imported profile asset to a different folder or project.
Can I change the connection type, source object, and formatting options of a profile job?: Yes, you can edit the connection, source object, and formatting options of the profile job, subject to the following conditions:
You can replace the connection only with a connection of the same connection type.
You cannot change the source object to a source object of a different connection.
I'm unable to choose a runtime environment for a profile of a flat file connection. Why?: Data Profiling does not support a change in the runtime environment for a profile of a flat file connection. The profile runs on the default runtime environment configured for the flat file connection in Administrator.
How do I configure or override the runtime environment for Avro and Parquet file format types?: You must select a runtime environment that is associated with the advanced configuration.
Where do I find more information about advanced clusters and Informatica encryption for an Amazon S3 V2 connector on an advanced cluster?: For more information about advanced clusters, see the Administrator help.
For more information about Informatica encryption for an Amazon S3 V2 connector on an advanced cluster, see the Configuring Informatica Encryption for Elastic Mappings in Amazon S3 V2 Connector How-to-Library article.

Why do I see the "ERROR: OPTION_NOT_VALID: OPTION_NOT_VALID Message 000 of class SAIS type E" error while importing SAP S/4 HANA source objects? How do I resolve this issue?: Before you import SAP S/4 HANA source objects, you must configure the SapStrictSql custom property and set the value based on the SAP system language for the Secure Agent. For more information, see the Knowledge Base article.
Why are columns not appearing on the Profile Definition page for some of the SAP sources? How do I resolve this issue?: This issue occurs if the source object includes an SSTRING, STRING, or RAWSTRING data type with a precision that is not defined in SAP. To resolve this issue, perform the steps that are specified in the Rules and guidelines for SSTRING, STRING, and RAWSTRING data types section of the SAP connector help.
Where do I find more information about troubleshooting SAP Table connection errors?: For more information about troubleshooting SAP Table connection errors, see the SAP Table connection errors section in the SAP connector help.
Columns do not appear on the Profile Definition page if the path that I select during the profile creation is not present in the target connection and if the same file name exists in the target connection. How do I resolve this?: You can perform the following steps:
Update the source connection to include the folder path such as cdqetestbucket-useast2/Finance.
Create a profile using the source present at the cdqetestbucket-useast2/Finance location.
Export and import the profile to the target connection using the folder path cdqetestbucket-uswest2/Finance2, which includes the same source object.
When I import profiles in bulk, the import job fails. How do I resolve this?: You can perform the following steps:
Delete all profiles from the target folder.
Delete all profiles using the following API calls:
GET API call:
https://na1-dqprofile.dm-us.informaticacloud.com/profiling-service/api/v1/profile/

DELETE API call:
https://na1-dqprofile.dm-us.informaticacloud.com/profiling-service/api/v1/profile/a4181390

For more information, see the Getting Started with Cloud Data Profiling REST API documentation.
Verify that profiles are deleted in the target folder. Use the following GET API call:
https://usw3-dqprofile.dm-ap.informaticacloud.com/profiling-service/api/v1/profile
Ensure that the response does not show any profiles.
Import profiles again.
If profiles in the zip file have an empty object reference, clear the selection of those profiles while importing.
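The GET and DELETE calls in the steps above can be prepared with a small script. The following is a minimal sketch that only builds the requests and does not send them. The function name build_request is hypothetical, the IDS-SESSION-ID header name is an assumption that you should confirm in the REST API documentation, and the host name in the base URL depends on your region.

```python
import urllib.request

# Base URL copied from the example calls above; the host differs by region.
BASE_URL = "https://na1-dqprofile.dm-us.informaticacloud.com/profiling-service/api/v1/profile"


def build_request(method, session_id, profile_id=None):
    """Build (but do not send) a GET or DELETE request for the profile endpoint."""
    url = f"{BASE_URL}/{profile_id}" if profile_id else f"{BASE_URL}/"
    request = urllib.request.Request(url, method=method)
    # Assumption: the session ID is passed in an IDS-SESSION-ID header.
    request.add_header("IDS-SESSION-ID", session_id)
    return request


list_request = build_request("GET", "<session ID>")
delete_request = build_request("DELETE", "<session ID>", profile_id="a4181390")
print(delete_request.get_method(), delete_request.full_url)
```

To send a request, pass it to urllib.request.urlopen. The structure of the GET response is not described here, so how you extract the profile IDs from it depends on the REST API documentation.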
When I run a profile with a rule occurrence, I see that the scorecard job fails with the following error message, but the profile results page displays the results: I/O error on POST request for "https://dqprofile-intproxy-usw1.infacloudops.net/profiling-service/internal/api/v1/ruleOccurrence/publishResults": Read timed out; nested exception is java.net.SocketTimeoutException: Read timed out

To resolve this issue, make sure that you associate 200 or fewer rule occurrences with a profile, and rerun the profile.

If the profiling task still fails, perform the following steps:
Reimport the profile.
Reduce the number of associated rule occurrences to 200 or fewer.
Rerun the profile.

For optimal performance and safety, keep the number of rule occurrences that you associate with a profile at 200 or fewer.
When I run a profile using Databricks, the job fails and the following error message appears: Error running query: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 14 tasks (4.3 GiB) is bigger than spark.driver.maxResultSize 4.0 GiB.

To resolve this issue, go to the Data Access Configuration section inside your cluster and increase the spark.driver.maxResultSize value to 8 GB or higher.