Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Connections
  4. Mappings in the Hadoop Environment
  5. Mapping Objects in the Hadoop Environment
  6. Processing Hierarchical Data on the Spark Engine
  7. Stateful Computing on the Spark Engine
  8. Monitoring Mappings in the Hadoop Environment
  9. Mappings in the Native Environment
  10. Profiles
  11. Native Environment Optimization
  12. Data Type Reference
  13. Complex File Data Object Properties
  14. Function Reference
  15. Parameter Reference

Aggregate Functions as Window Functions

Aggregate Functions as Window Functions

In addition to LEAD and LAG, you can also use aggregate functions as window functions. When you use aggregate functions like SUM and AVG as window functions, you can perform running calculations that are similar to the stateful functions MOVINGSUM, MOVINGAVG, and CUME. Window functions are more flexible than stateful functions because you can set a specific end offset.
To use an aggregate function as a window function, you must define a frame in the windowing properties. You define a frame to limit the scope of the calculation. The aggregate function performs a calculation across the frame and produces a single value for each row.

Example

You are a lumber salesperson who sold different quantities of wood over the past two years. You want to calculate a running total of sales quantities.
The following table lists each sale ID, the date, and the quantity sold:
Sale_ID
Date
Quantity
30001
2016-08-02
10
10001
2016-12-24
10
10005
2016-12-24
30
40001
2017-01-09
40
10006
2017-01-18
10
20001
2017-02-12
20
A SUM function adds all the values and returns one output value. To get a running total for each row, you can define a frame for the function boundaries.
The following image shows the frame you specify on the Windowing tab:
The start offset is all rows preceding and the end offset is 0. The order key is ascending date. The partition key is not specified.
You configure the following windowing properties:
  • Start offset: All Rows Preceding
  • End offset: 0
  • Order Key: Date Ascending
  • Partition Key: Not specified
You define the following aggregate function:
SUM (Quantity)
SUM adds the quantity in the current row to the quantities in all the rows preceding the current row. The function returns a running total for each row.
The following table lists a running sum for each date:
Sale_ID
Date
Quantity
Total
30001
2016-08-02
10
10
10001
2016-12-24
10
20
10005
2016-12-24
30
50
40001
2017-01-09
40
90
10006
2017-01-18
10
100
20001
2017-02-12
20
120


Updated December 13, 2018