Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings in the Hadoop Environment
  4. Mapping Sources in the Hadoop Environment
  5. Mapping Targets in the Hadoop Environment
  6. Mapping Transformations in the Hadoop Environment
  7. Processing Hierarchical Data on the Spark Engine
  8. Configuring Transformations to Process Hierarchical Data
  9. Processing Unstructured and Semi-structured Data with an Intelligent Structure Model
  10. Stateful Computing on the Spark Engine
  11. Monitoring Mappings in the Hadoop Environment
  12. Mappings in the Native Environment
  13. Profiles
  14. Native Environment Optimization
  15. Cluster Workflows
  16. Connections
  17. Data Type Reference
  18. Function Reference
  19. Parameter Reference

Big Data Management User Guide

Big Data Management User Guide

Aggregate Functions as Window Functions

Aggregate Functions as Window Functions

In addition to LEAD and LAG, you can also use aggregate functions as window functions. When you use aggregate functions like SUM and AVG as window functions, you can perform running calculations that are similar to the stateful functions MOVINGSUM, MOVINGAVG, and CUME. Window functions are more flexible than stateful functions because you can set a specific end offset.
To use an aggregate function as a window function, you must define a frame in the windowing properties. You define a frame to limit the scope of the calculation. The aggregate function performs a calculation across the frame and produces a single value for each row.

Example

You are a lumber salesperson who sold different quantities of wood over the past two years. You want to calculate a running total of sales quantities.
The following table lists each sale ID, the date, and the quantity sold:
Sale_ID
Date
Quantity
30001
2016-08-02
10
10001
2016-12-24
10
10005
2016-12-24
30
40001
2017-01-09
40
10006
2017-01-18
10
20001
2017-02-12
20
A SUM function adds all the values and returns one output value. To get a running total for each row, you can define a frame for the function boundaries.
The following image shows the frame you specify on the Windowing tab:
The start offset is all rows preceding and the end offset is 0. The order key is ascending date. The partition key is not specified.
You configure the following windowing properties:
  • Start offset: All Rows Preceding
  • End offset: 0
  • Order Key: Date Ascending
  • Partition Key: Not specified
You define the following aggregate function:
SUM (Quantity)
SUM adds the quantity in the current row to the quantities in all the rows preceding the current row. The function returns a running total for each row.
The following table lists a running sum for each date:
Sale_ID
Date
Quantity
Total
30001
2016-08-02
10
10
10001
2016-12-24
10
20
10005
2016-12-24
30
50
40001
2017-01-09
40
90
10006
2017-01-18
10
100
20001
2017-02-12
20
120

0 COMMENTS

We’d like to hear from you!