Property | Description
---|---
Database | The name of the schema in Databricks. The name can contain only alphanumeric characters and hyphens (-). This property is optional for SQL warehouse, all-purpose cluster, and job cluster. If you do not specify a value, all databases available in the workspace are listed. The value you specify overrides the schema specified in the SQL Warehouse JDBC URL connection property.
JDBC Driver Class Name | The name of the JDBC driver class. This property is optional for SQL warehouse, all-purpose cluster, and job cluster. Default is com.databricks.client.jdbc.Driver.
Staging Environment | The staging environment where your data is temporarily stored before processing. This property is required for SQL warehouse, all-purpose cluster, and job cluster. Select the staging environment to use. Default is Volume. You cannot switch between clusters after you establish a connection. Effective in the October 2024 release, personal staging location is deprecated. Deprecated functionality is supported, but Informatica intends to drop support in a future release. Informatica requests that you use a volume to stage the data.
Volume Path | The absolute path in the volume where you want to stage the data temporarily. Specify the path in the following format: /Volumes/<catalog_identifier>/<schema_identifier>/<volume_identifier>/ If you do not specify a volume path, the Secure Agent creates a managed volume in Databricks.
Databricks Host | The host name of the endpoint that the Databricks account belongs to. This property is required only for all-purpose cluster and job cluster. Doesn't apply to SQL warehouse. You can get the Databricks Host from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks all-purpose cluster. A parsing sketch follows this table. The following example shows the Databricks Host in the JDBC URL: jdbc:databricks://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token> In the JDBC URL examples for Databricks Host, Organization ID, and Cluster ID, the value of PWD is always <personal-access-token>.
Cluster ID | The ID of the cluster. This property is required only for all-purpose cluster and job cluster. Doesn't apply to SQL warehouse. You can get the cluster ID from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks all-purpose cluster. The following example shows the Cluster ID in the JDBC URL: jdbc:databricks://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>
Organization ID | The unique organization ID for the workspace in Databricks. This property is required only for all-purpose cluster and job cluster. Doesn't apply to SQL warehouse. You can get the organization ID from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks all-purpose cluster. The following example shows the Organization ID in the JDBC URL: jdbc:databricks://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Organization ID>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>
Min Workers¹ | The minimum number of worker nodes to be used for the Spark job. Minimum value is 1. This property is required only for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster.
Max Workers¹ | The maximum number of worker nodes to be used for the Spark job. If you don't want to autoscale, set Max Workers equal to Min Workers or don't set Max Workers. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster.
DB Runtime Version¹ | The Databricks runtime version of the job cluster to spawn when you connect to a job cluster to process mappings. This property is required only for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. Select Databricks runtime version 9.1 LTS, 13.3 LTS, or 15.4 LTS. To use version 15.4 LTS, ensure that you set the spark.databricks.driver.dbfsLibraryInstallationAllowed parameter to true in the Spark Configuration connection property.
Worker Node Type¹ | The worker node instance type that is used to run the Spark job. This property is required only for all-purpose cluster and job cluster. Doesn't apply to SQL warehouse. For example, the worker node type for AWS can be i3.2xlarge. The worker node type for Azure can be Standard_DS3_v2.
Driver Node Type¹ | The driver node instance type that is used to collect data from the Spark workers. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. For example, the driver node type for AWS can be i3.2xlarge. The driver node type for Azure can be Standard_DS3_v2. If you don't specify the driver node type, Databricks uses the value you specify in the worker node type field.
Instance Pool ID¹ | The instance pool ID used for the Spark cluster. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. If you specify the Instance Pool ID to run mappings, the following connection properties are ignored:
Elastic Disk¹ | Enables the cluster to get additional disk space. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. Enable this option if the Spark workers are running low on disk space.
Spark Configuration¹ | The Spark configuration to use in the job cluster. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. The configuration must be in the following format: "key1"="value1";"key2"="value2";... For example, "spark.executor.userClassPathFirst"="False". A sketch of this format follows this table. To use Databricks runtime version 15.4, specify the following parameter: 'spark.databricks.driver.dbfsLibraryInstallationAllowed'='true'. Doesn't apply to Data Ingestion and Replication tasks.
Spark Environment Variables¹ | The environment variables to export before launching the Spark driver and workers. This property is optional for job cluster. Doesn't apply to SQL warehouse and all-purpose cluster. The variables must be in the following format: "key1"="value1";"key2"="value2";... For example, "MY_ENVIRONMENT_VARIABLE"="true". Doesn't apply to Data Ingestion and Replication tasks.
¹ Doesn't apply to mappings in advanced mode.
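
The table notes that the Databricks Host, Organization ID, and Cluster ID can all be read from the same JDBC URL shown in the cluster's Advanced Options. The following is a minimal, illustrative Python sketch of that lookup, assuming a URL in the format shown above; the sample URL, regular expressions, and function name are examples and are not part of the connector.

```python
import re

# Placeholder JDBC URL in the format shown in the table above; not a real endpoint.
JDBC_URL = (
    "jdbc:databricks://adb-1234567890123456.7.azuredatabricks.net:443/"
    "default;transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/1234567890123456/0123-456789-abcde123;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)

def parse_jdbc_url(url: str) -> dict:
    """Extract the Databricks Host, Organization ID, and Cluster ID from a JDBC URL."""
    # Host is the authority part after jdbc:databricks://, up to the port.
    host = re.search(r"jdbc:databricks://([^:/;]+)", url).group(1)
    # httpPath has the form sql/protocolv1/o/<Org Id>/<Cluster ID>.
    org_id, cluster_id = re.search(
        r"httpPath=sql/protocolv1/o/([^/;]+)/([^/;]+)", url
    ).groups()
    return {"Databricks Host": host, "Organization ID": org_id, "Cluster ID": cluster_id}

print(parse_jdbc_url(JDBC_URL))
# {'Databricks Host': 'adb-1234567890123456.7.azuredatabricks.net',
#  'Organization ID': '1234567890123456',
#  'Cluster ID': '0123-456789-abcde123'}
```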
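The Spark Configuration and Spark Environment Variables properties both use the "key1"="value1";"key2"="value2";... format described above. The sketch below only illustrates assembling such a string; the helper name and sample keys are examples, and the 15.4 LTS parameter is the one called out in the table.

```python
# Illustrative only: join key-value pairs into the "key"="value" format,
# separated by semicolons, as expected by the Spark Configuration and
# Spark Environment Variables connection properties.
def to_property_string(pairs: dict) -> str:
    return ";".join(f'"{key}"="{value}"' for key, value in pairs.items())

spark_config = to_property_string({
    "spark.executor.userClassPathFirst": "False",
    # Required when you select Databricks runtime version 15.4 LTS.
    "spark.databricks.driver.dbfsLibraryInstallationAllowed": "true",
})
print(spark_config)
# "spark.executor.userClassPathFirst"="False";"spark.databricks.driver.dbfsLibraryInstallationAllowed"="true"
```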