OLAP window aggregate functions

Several functions that return aggregate results, such as sums and averages, from the results of a query can also be used as OLAP functions from the context of an OLAP window.

Syntax

(1)
Window aggregate functions

1 
2.1 AVG
2.1 COUNT
2.1 MAX
2.1 MIN
2.1 RANGE
2.1 STDEV
2.1 SUM
2.1 VARIANCE
1 (
2.1 DISTINCT
2.1 UNIQUE
1? 
2.1 table
2.1 alias
2.1 view
2.1 synonym
1 .
1 column_expr
1 ) %OVER clause for aggregation functions1
Element Description Restrictions Syntax
column_expr Column expression argument to the aggregate function See the headings for individual functions below Identifier
alias, synonym, table, view Synonym, table, view, or alias that contains column Synonym and the table or view to which it points must exist Identifier

Usage

These aggregate functions do not require an OLAP window to calculate aggregate values from a query result set. They behave like OLAP window aggregation functions, however, in calling contexts where the OVER clause in the function expression defines one or more window partitions, or includes the window ORDER clause and the window Frame clause.

Important: When the DISTINCT or UNIQUE keyword is part of the window aggregate function specification, the OVER clause of the window aggregate expression cannot include the Window ORDER clause or the Window Frame clause.

The following aggregate functions can return information about the rows in OLAP window partitions.

AVG function

The AVG function returns the average of all values in the specified column or expression in a window partition of the query results, for the rows in each partition that is defined in the OVER clause. If the OVER clause includes the Window Frame clause, AVG returns a value for each set of rows in a window frame.

You can apply the AVG function only to columns of number data types. If you use the DISTINCT (or UNIQUE) keyword, the average (meaning the mean) is calculated from only the distinct values in the specified column or expression, and the OVER clause cannot include a Window ORDER or Window Frame clause.

NULL values are ignored unless every value in the column or expression is NULL. If every value is NULL, the AVG function returns a NULL for that column or expression.

You cannot use the AVG function with a nonnumeric column or expression.

COUNT function

The COUNT function returns the cardinality of non-NULL values in the specified column or expression in a window partition of the query results, for each partition that is defined in the OVER clause. If the OVER clause includes the Window Frame clause, COUNT returns a value for each set of rows in a window frame.

If you use the DISTINCT (or UNIQUE) keyword, the cardinality of rows in the partition is calculated from only the distinct values in the specified column or expression, and the OVER clause cannot include a Window ORDER or Window Frame clause.

NULL values are ignored unless every value in the column or expression is NULL. If every value is NULL, the COUNT function returns a NULL for that column or expression.

MAX function

The MAX function returns the largest value in the column or expression in a window partition of the query results, for the rows in each partition that is defined in the OLAP window OVER clause.

Specifying the DISTINCT or UNIQUE keyword has no effect on the results, but (as with the other built-in aggregate functions) no Window ORDER or Window Frame clause is allowed.

If the OVER clause includes the Window Frame clause, MAX returns a value for each set of rows in a window frame.

When a column expression is specified as argument to COUNT, NULL values are ignored unless every value in the specified column expression is NULL. If every value is NULL, MAX returns a NULL value for that column or expression. When COUNT(*) is specified, NULL values are counted the same as other values.

MIN function

The MIN function returns the lowest value in the column or expression in a window partition of the query results, for each partition that the OLAP window OVER clause defines. If the OVER clause includes the Window Frame clause, the MIN function returns a value for each set of rows.

Specifying the DISTINCT or UNIQUE keyword has no effect on the results, but (as with the other built-in aggregate functions) no Window ORDER or window Frame clause is allowed.

If the OVER clause includes the Window Frame clause, MIN returns a value for each set of rows in a window frame.

NULL values are ignored unless every value in the specified column expression is NULL. If every value is NULL, MIN returns a NULL value for that column expression.

RANGE function

The RANGE function returns the range of values in the column or expression in a window partition of the query results, for each partition that the OLAP window OVER clause defines. If the OVER clause includes the Window Frame clause, RANGE returns a value for each set of rows in a window frame.

The RANGE function calculates the difference between the maximum and the minimum values, as follows:
 range(expr) = max(expr) - min(expr)
You can apply the RANGE function only to numeric column expressions. The following query finds the range of ages for a population:
SELECT RANGE(age) OVER () FROM u_pop;
Because DATE values are stored internally as integers, you can use the RANGE function on DATE column. With a DATE column, the return value is the number of days between the earliest and latest dates in the column expression.

NULL values are ignored unless every value in the column expression is NULL. If every column expression value is NULL, the RANGE function returns a NULL for that column expression.

STDEV function

The STDEV function returns the standard deviation of a column or expression, using the following formula:
SQRT((SUM(Xi2) - (SUM(Xi)2)/N)/(N - 1)) 
In this formula, Xi is each column expression value in the window partition or frame that the OVER clause specifies, and N is the total number of non-NULL values in the column expression.

If the OVER clause includes the Window Frame clause, the STDEV function returns a value for each set of rows in a window frame.

NULL values are ignored, unless every value in the specified column expression is NULL. If every column expression value is NULL, the STDEV function returns NULL for that column expression.

You can apply the STDEV function only to numeric column expressions. You cannot use this function on column expressions of type DATE.

SUM function

The SUM function calculates and returns the sum of all the values of a column expression in a window partition of the query results, for the rows in each partition that is defined in the OLAP window OVER clause.

If you use the DISTINCT (or UNIQUE) keyword as the first item in the argument list, the sum is for only distinct values in the column or expression, and no Window ORDER or Window Frame clause is allowed.

NULL values are ignored, unless every value is NULL. If every value is NULL, the SUM function returns the NULL value for that column or expression.

You cannot use the SUM function with a nonnumeric column or expression.

VARIANCE function

The VARIANCE function calculates and returns the mean squared error as an estimate of the population variance of the values in a specified numeric column or expression for the partitions of query results that are defined in the OVER clause that follows the OLAP window VARIANCE expression.

If the OVER clause includes the Window Frame clause, the VARIANCE function returns a value for each set of rows, using the following formula:
(SUM(Xi2) - (SUM(Xi)2)/N)/(N - 1)
In this formula,
  • Xi is each column value in the window partition or frame that the OVER clause specifies,
  • and N is the total number of non-NULL values in the column (unless all values are NULL, in which case the variance is logically undefined, and the VARIANCE function returns NULL).
You can apply the VARIANCE function only to numeric columns.

Example: AVG function with partitioning

In the following example, the AVG function is used in an OLAP window aggregation expression to return the rolling average closeprice column values during the year 2012 for two window partitions, based on ABC and on XYZ values in the symbol column as the partitioning keys.
   SELECT symbol, tradingdate,
     AVG(closeprice) OVER (PARTITION BY symbol
       ORDER BY tradingdate
     ROWS BETWEEN 29 PRECEDING AND CURRENT ROW)
   FROM dailystockdata
   WHERE symbol IN ('ABC', 'XYZ')
     AND tradingdate BETWEEN '2012-01-01' AND '2012-12-31';
The window ORDER clause specifies the tradingdate column value as the sorting key, and the window Frame clause defines a rolling window that is based on the average of 30 consecutive tradingdate values, ending with the tradingdate of the current row.

Example: AVG function without partitioning

The following query returns the stock price ordered by day and the average of the prices of the current day, the day before, and the day after. The result set is not partitioned because the query does not include a PARTITION BY clause.

SELECT price,  
       AVG(price) OVER (ORDER BY tradingday 
                 ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM stock_price
WHERE tradingday BETWEEN '2012-11-01' AND '2012-11-07';

         price          (avg)
     18.25     18.31
     18.37     18.31
               18.37
               19.03
     19.03     18.81
     18.59     18.61
     18.21     18.40 

The first value in the avg column is the average of the first two values in the price column, because no preceding values exist for the first value of the price column.

The second value in the avg column is the average of the first two values in the price column, because the third row of the price column does not have a value.

The third value in the avg column is equal to the second value in the price column because the third and fourth rows in the price column do not have values.

Example: COUNT function

The following query returns the shipping date, the shipping charge, and the number of orders for each order by customer. The query results are partitioned by customer number and limited to customer numbers that are less than or equal to 110.

SELECT customer_num, ship_date, ship_charge,
       COUNT(*) OVER (PARTITION BY customer_num) 
FROM orders 
WHERE customer_num <= 110;

customer_num ship_date  ship_charge       (count(*)) 

         101 05/26/2008      $15.30                1
         104 05/23/2008      $10.80                4
         104 07/03/2008       $5.00                4
         104 06/01/2008      $10.00                4
         104 07/10/2008      $12.20                4
         106 05/30/2008      $19.20                2
         106 07/03/2008      $12.30                2
         110 07/06/2008      $13.80                2
         110 07/16/2008       $6.30                2 

Customer 104 appears in the list four times. The value in the count column is always 4 for customer 104.