Connectors and Connections

Advanced settings

The following table describes the advanced connection properties:
Property
Description
Database
The database name that you want to connect to in Databricks Delta.
Optional for SQL warehouse and Databricks cluster.
For Data Integration, if you do not provide a database name, all databases available in the workspace are listed. The value you provide here overrides the database name provided in the SQL Warehouse JDBC URL connection property.
JDBC Driver Class Name
The name of the JDBC driver class.
Optional for SQL warehouse and Databricks cluster.
For JDBC URL versions 2.6.22 or earlier, specify the driver class name as com.simba.spark.jdbc.Driver.
For JDBC URL versions 2.6.25 or later, specify the driver class name as com.databricks.client.jdbc.Driver.
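As an illustrative aside, the following minimal Java sketch shows how a JDBC client would register one of these driver classes and open a connection. It is not part of the connection configuration: the URL uses the same placeholder form as the cluster URL examples later in this topic, and it pairs that URL with the com.simba.spark.jdbc.Driver class on the assumption that the older jdbc:spark URL form goes with the 2.6.22-or-earlier driver.

import java.sql.Connection;
import java.sql.DriverManager;

public class DatabricksConnectionSketch {
    public static void main(String[] args) throws Exception {
        // Driver class name depends on the JDBC driver version:
        //   2.6.22 or earlier -> com.simba.spark.jdbc.Driver
        //   2.6.25 or later   -> com.databricks.client.jdbc.Driver
        Class.forName("com.simba.spark.jdbc.Driver");

        // Placeholder all-purpose cluster URL in the same form as the examples
        // in this topic. Replace <Databricks Host>, <Org Id>, <Cluster ID>, and
        // <personal-access-token> with values from your workspace.
        String url = "jdbc:spark://<Databricks Host>:443/default;transportMode=http;"
                + "ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;"
                + "AuthMech=3;UID=token;PWD=<personal-access-token>";

        try (Connection connection = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !connection.isClosed());
        }
    }
}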
Staging Environment
The cloud provider where the Databricks cluster is deployed.
Required for SQL warehouse and Databricks cluster.
Select one of the following options:
  • AWS
  • Azure
  • Personal Staging Location
Default is Personal Staging Location.
You can select Personal Staging Location as the staging environment, instead of the Azure or AWS staging environments, to stage data locally for mappings and tasks.
If you select Personal Staging Location for a connection that Mass Ingestion uses, the Parquet data files for application ingestion or database ingestion jobs can be staged to a local personal storage location, which has a data retention period of 7 days. You must also specify a Database Host value. If you use Unity Catalog, note that a personal storage location is automatically provisioned.
Personal staging location doesn't apply to Databricks cluster.
You cannot use personal staging location with Databricks Delta unmanaged tables.
You cannot switch between clusters once you establish a connection.
Databricks Host
The host name of the endpoint the Databricks account belongs to.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
You can get the Databricks Host from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks Delta all-purpose cluster.
The following example shows the Databricks Host in the JDBC URL:
jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>
In the JDBC URL examples for Databricks Host, Organization ID, and Cluster ID, the value of PWD is always <personal-access-token>.
Cluster ID
The ID of the cluster.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
You can get the cluster ID from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks Delta all-purpose cluster.
The following example shows the Cluster ID in the JDBC URL:
jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>
Organization ID
The unique organization ID for the workspace in Databricks.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
You can get the Organization ID from the JDBC URL. The URL is available in the Advanced Options of JDBC or ODBC in the Databricks Delta all-purpose cluster.
The following example shows the Organization ID in the JDBC URL:
jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Organization ID>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>
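As a hedged illustration (not part of the connection configuration), the sketch below extracts the Databricks Host, Organization ID, and Cluster ID from a JDBC URL of the form shown in the examples above. The regular expression and the sample host, organization ID, and cluster ID values are made up for this example.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JdbcUrlParts {
    // Matches the host, organization ID, and cluster ID in an all-purpose
    // cluster JDBC URL of the form shown in the examples above.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "jdbc:spark://([^:/;]+):443/.*?httpPath=sql/protocolv1/o/([^/;]+)/([^/;]+);");

    public static void main(String[] args) {
        // Sample URL with made-up host, organization ID, and cluster ID values.
        String url = "jdbc:spark://dbc-a1b2c3d4-e5f6.cloud.databricks.com:443/default;"
                + "transportMode=http;ssl=1;"
                + "httpPath=sql/protocolv1/o/1234567890123456/0123-456789-abcde123;"
                + "AuthMech=3;UID=token;PWD=<personal-access-token>";

        Matcher matcher = URL_PATTERN.matcher(url);
        if (matcher.find()) {
            System.out.println("Databricks Host: " + matcher.group(1));  // dbc-a1b2c3d4-e5f6.cloud.databricks.com
            System.out.println("Organization ID: " + matcher.group(2));  // 1234567890123456
            System.out.println("Cluster ID:      " + matcher.group(3));  // 0123-456789-abcde123
        }
    }
}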
Min Workers
The minimum number of worker nodes to be used for the Spark job. Minimum value is 1.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
Max Workers
The maximum number of worker nodes to be used for the Spark job. If you don't want the cluster to autoscale, set Max Workers equal to Min Workers or leave Max Workers unset.
Optional for Databricks cluster. Doesn't apply to SQL warehouse.
DB Runtime Version
The version of the Databricks cluster to spawn when you connect to the Databricks cluster to process mappings.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
Select the Databricks runtime version 9.1 LTS or 13.3 LTS.
Worker Node Type
The worker node instance type that is used to run the Spark job.
Required for Databricks cluster. Doesn't apply to SQL warehouse.
For example, the worker node type for AWS can be i3.2xlarge. The worker node type for Azure can be Standard_DS3_v2.
Driver Node Type
The driver node instance type that is used to collect data from the Spark workers.
Optional for Databricks cluster. Doesn't apply to SQL warehouse.
For example, the driver node type for AWS can be i3.2xlarge. The driver node type for Azure can be Standard_DS3_v2.
If you don't specify the driver node type, Databricks uses the value you specify in the worker node type field.
Instance Pool ID
The instance pool ID used for the Spark cluster.
Optional for Databricks cluster. Doesn't apply to SQL warehouse.
If you specify the Instance Pool ID to run mappings, the following connection properties are ignored:
  • Driver Node Type
  • EBS Volume Count
  • EBS Volume Type
  • EBS Volume Size
  • Enable Elastic Disk
  • Worker Node Type
  • Zone ID
Elastic Disk
Enables the cluster to get additional disk space.
Optional for Databricks cluster. Doesn't apply to SQL warehouse.
Enable this option if the Spark workers are running low on disk space.
Spark Configuration
Doesn't apply to a data loader task or to Mass Ingestion tasks.
Spark Environment Variables
Doesn't apply to a data loader task or to Mass Ingestion tasks.
