PowerExchange for HBase User Guide

PowerExchange for HBase User Guide

Filtering Source Data

Filtering Source Data

When you configure a mapping that reads data from an HBase source, you can enter a filter expression to filter records read from the source.
You can select the mapping and add the filter expression in the Query tab in the Properties view. You can use any comparison operators in the filter expression.
When you run the mapping, the Data Integration Service filters the source data based on the expressions.
For numeric data types, the Data Integration Service applies the operators for positive values and not for negative values.
When you enter multiple filter expressions, the AND logical operator is applied between the expressions.
If you use the not equal, less than, less than or equal to operators and some columns do not meet the filter condition, the Data Integration Service returns a Null value for these columns. If you use the equal, greater than, greater than or equal to operators and some columns do not meet the filter condition, the Data Integration Service does not return the rows associated with these columns.
Example
The following table lists the columns in the CF column family in an HBase table. There are rows that have c1 and c2 columns, rows that have at least one of the columns, and rows that have neither of the columns.
Row
Column Value
1
column=CF__c1, value=john
1
column=CF__c2, value=jane
2
column=CF__c1, value=jane
3
column=CF__c2, value=jdoe
4
column=CF__c8, value=adam
Create an HBase data object called Name and add it to an HBase mapping. Add the following filter expressions:
Name.CF__C1 = 'john' AND Name.CF__C2 = 'jane'
The Data Integration Service returns the following output because of the equal to operator in the filter expressions. The service does not return rows that contain null values.
ROW: 1
c1: john
c2: jane
However, the output is different when you add the following filter expressions:
Name.CF__C1 != 'john' AND Name.CF__C2 != 'jane'
The Data Integration Service returns rows that contain null values because of the not equal to operator in the filter expression.
Row 2
c1: jane
Row 3:
c2: jdoe
ROW: 4
c1: null
c2: null