Aggregate functions return summary values for non-null values in selected ports. With aggregate functions you can:
Calculate a single value for all rows in a group.
Return a single value for each group in an Aggregator transformation.
Apply filters to calculate values for specific rows in the selected ports.
Use operators to perform arithmetic within the function.
Calculate two or more aggregate values derived from the same source columns in a single pass.
The transformation language includes the following aggregate functions:
ANY
AVG
COLLECT_LIST
COLLECT_MAP
COUNT
FIRST
LAST
MAX (Date)
MAX (Number)
MAX (String)
MEDIAN
MIN (Date)
MIN (Number)
MIN (String)
PERCENTILE
STDDEV
SUM
VARIANCE
If you configure the
PowerCenter Integration Service
to run in Unicode mode, MIN and MAX return values according to the sort order of the code page you specify in the session properties.
You can use aggregate functions in Aggregator transformations. You can nest only one aggregate function within another aggregate function. The
PowerCenter Integration Service
evaluates the innermost aggregate function expression and uses the result to evaluate the outer aggregate function expression. You can set up an Aggregator transformation that groups by ID and nests two aggregate functions as follows:
SUM( AVG( earnings ) )
where the dataset contains the following values:
ID
EARNINGS
1
32
1
45
1
100
2
65
2
75
2
76
3
21
3
45
3
99
The return value is 186. The
PowerCenter Integration Service
groups by ID, evaluates the AVG expression, and returns three values. Then it adds the values with the SUM function to get the result.
You can also use aggregate functions as window functions in an Expression transformation. To use an aggregate function as a window function when you run a mapping on the Spark engine, you must configure the transformation for windowing. If you use an aggregate function as a window function, the Expression transformation becomes active.