Batch interval. The Spark engine processes the streaming data from sources and publishes the data in batches. The batch interval is the number of seconds after which a batch is submitted for processing.
Cache refresh interval. You can cache a large lookup source or small lookup tables. When you cache the lookup source, the Data Integration Service queries the lookup cache instead of querying the lookup source for each input row. You can configure the interval for refreshing the cache used in a relational Lookup transformation.
State Store Connection. You can select an external storage connection for the state store. The default external storage connection is HDFS. You can browse the state store connection property to select Amazon S3, Microsoft Azure Data Lake Storage Gen1, or Microsoft Azure Data Lake Storage Gen2 as the external storage. You can also parameterize the connection.
Checkpoint Directory. You can specify a checkpoint directory so that a mapping can resume reading data from the point of failure when the mapping fails, or from the point at which a cluster is deleted. The checkpoint directory is created within the directory that you specify in the State Store property.
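As a rough illustration of how the checkpoint directory nests inside the state store directory, the following Python sketch joins the two values into one location. The function name `resolve_checkpoint_dir`, the HDFS URI, and the directory name `orders_mapping` are hypothetical and chosen only for this example. The sketch shows the nesting relationship described above, not the product's actual path-resolution logic.

```python
import posixpath


def resolve_checkpoint_dir(state_store_dir: str, checkpoint_dir: str) -> str:
    """Return a checkpoint location nested under the state store directory.

    Hypothetical helper for illustration only; the real layout is decided
    by the product at run time.
    """
    # Strip leading and trailing slashes so the checkpoint name is always
    # treated as relative to the state store directory.
    relative = checkpoint_dir.strip("/")
    if not relative:
        raise ValueError("checkpoint directory must not be empty")
    return posixpath.join(state_store_dir, relative)


# Assumed example values: an HDFS state store directory and a checkpoint
# directory named after the mapping.
print(resolve_checkpoint_dir("hdfs://namenode:8020/streaming/state", "orders_mapping"))
```

Under this assumption, changing the State Store property also changes where the checkpoint data is located, because the checkpoint directory is always relative to it.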