If the incremental key is an ID, the incremental key column must store numeric data, and each new row added to the source table must receive an ID that is greater than the IDs of the existing rows.
When you run the mass ingestion specification, the Spark engine fetches, as incremental data, every row in the source table whose ID is greater than the maximum ID value of the rows that were previously ingested.
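The comparison described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the Spark engine's actual implementation. The function name `select_incremental_rows` and its parameters are hypothetical, and rows are assumed to be plain dictionaries.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def select_incremental_rows(
    rows: List[Row], key: str, last_max_id: int
) -> Tuple[List[Row], int]:
    """Return the rows whose key is greater than last_max_id,
    together with the maximum ID to remember for the next run.

    Hypothetical helper: illustrates the comparison only.
    """
    # Keep only rows with an ID strictly greater than the stored maximum.
    new_rows = [row for row in rows if row[key] > last_max_id]

    # If nothing new arrived, the stored maximum stays unchanged.
    new_max_id = max((row[key] for row in new_rows), default=last_max_id)
    return new_rows, new_max_id
```

The comparison is strict, so a row whose ID equals the stored maximum is not fetched again.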
For example, you might have ingested the following source table in the previous run of the specification:
EmpID     EmpLastName
481530    'Basquez'
481531    'Savage'
481532    'Greene'
Note that the maximum ID value is 481532.
The following table shows the current data in the source table:
EmpID     EmpLastName
481530    'Basquez'
481531    'Savage'
481532    'Greene'
481533    'Caldwell'
481534    'Galloway'
Because the IDs 481533 and 481534 are greater than the maximum ID 481532 that was previously ingested, the rows that are associated with these IDs are incremental data.
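The same selection can be written as a query with a greater-than filter on the key column. The sketch below uses Python's built-in sqlite3 module with an in-memory table purely for illustration. The table name `employees` is an assumption, and the query is not necessarily the one that the Spark engine issues.

```python
import sqlite3

# In-memory stand-in for the source table (table name is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (EmpID INTEGER, EmpLastName TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        (481530, "Basquez"),
        (481531, "Savage"),
        (481532, "Greene"),
        (481533, "Caldwell"),
        (481534, "Galloway"),
    ],
)

# Maximum ID recorded after the previous run of the specification.
previous_max_id = 481532

# Fetch only the rows with an ID greater than the previous maximum.
incremental_rows = conn.execute(
    "SELECT EmpID, EmpLastName FROM employees WHERE EmpID > ? ORDER BY EmpID",
    (previous_max_id,),
).fetchall()

print(incremental_rows)
# [(481533, 'Caldwell'), (481534, 'Galloway')]
```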
When you run the specification, the Spark engine ingests the following rows of data: