OVER clause for OLAP window expressions

The OVER clause defines the result set on which an OLAP window expression is performed.

Syntax

OVER clause

1 %OVER clause for numbering functions
1 %OVER clause for ranking functions
1 %OVER clause for aggregation functions
OVER clause for numbering functions

1  OVER (? %Window
PARTITION clause? %Window ORDER
clause )
OVER clause for ranking functions

1  OVER (? %Window
PARTITION clause %Window ORDER
clause )
OVER clause for aggregation functions

1  OVER (? %Window
PARTITION clause ?  %Window
ORDER clause? %Window Frame
clause )
Window PARTITION clause

1   PARTITION BY + , partition_key
Window ORDER clause

1  ORDER BY + , sorting_key
2.1! ASC
2.1?  DESC
1?  NULLS
2.1 FIRST
2.1 LAST
Window Frame clause

1 
2.1 RANGE
2.1 ROWS
2.1 
2.2.1  UNBOUNDED PRECEDING
2.2.1  offset PRECEDING
2.2.1 CURRENT ROW
2.1  BETWEEN
2.2.1  UNBOUNDED PRECEDING
2.2.1  offset PRECEDING
2.2.1  offset FOLLOWING
2.2.1 CURRENT ROW
2.1 AND
2.2.1  UNBOUNDED FOLLOWING
2.2.1  offset PRECEDING
2.2.1  offset FOLLOWING
2.2.1 CURRENT ROW
Element Description Restrictions Syntax
offset Unsigned integer that represents the offset from the position of the current row Cannot be negative. If zero, specifies the current row Literal integer.
partition_key Column name, alias, or constant expression by which to partition rows Must be in the select list of the Projection clause Column Expressions
sorting_key Column name, alias, or constant expression by which to sort rows Same restrictions as for partition_key. For RANGE window frames, only a single sorting key is allowed, and the data type must be numeric, DATE, or DATETIME. Column Expressions

If the OVER clause is empty, you must still include the empty parentheses.

Window PARTITION clause

An OLAP window partition is a subset of the rows that are returned by a query. Each partition is defined by one or more column expressions in the PARTITION BY specification of the OVER clause that defines the window. The database server applies the specified OLAP window function to all of the rows in each window partition. If no partitions are defined in the OVER clause, the window function is applied to every row in the result set of the query.

Window ORDER clause

The database server sorts the rows in each window partition according to the sort key (or multiple sort keys) in the window ORDER clause. If you specify no ascending (ASC) or descending (DESC) order, ASC is the default. If no ORDER clause is specified, the order of the qualifying rows is the order in which the rows were retrieved.

Window Frame clause

The window Frame clause can return subsets, called aggregation groups, of the rows in each window partition. A window frame is defined by a specific number of rows or by a range of values.

Row-based window frame
The ROWS keyword creates a row-based window frame that consists of a specific number of rows that precede or follow the current row, or both. The offset represents the number of rows to return. The following example returns seven rows that include the six rows that precede the current row:
AVG(price) OVER (ORDER BY year, day 
   ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
In a row-based window frame clause, offsets are expressed as unsigned integers because the keyword FOLLOWING specifies a positive offset from the current rows, and the keyword PRECEDING specifies a negative offset from the current row. The keyword UNBOUNDED refers to all of the rows from the current row to the limit of the window partition. As the first term after the ROWS keyword in a window Frame specification, UNBOUNDED PRECEDING means that the starting boundary is the first row in the partition, and UNBOUNDED FOLLOWING means that the ending boundary is the last row in the partition.
Value-based window frame
The RANGE keyword creates a value-based frame clause that consists of the current row plus the rows that meet the criteria that is set by the sorting key in the ORDER clause and fit into the specified offset. The offset represents the number of units of the data type of the sorting key. The sorting key must be a numeric, DATE, or DATETIME data type. For example, if the sorting key is a DATE data type, the offset represents a specific number of days. The following example returns the count of the number of rows that have a ship date within 2 days of the current row plus the current row:
COUNT(*) OVER (ORDER BY ship_date
   RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING)
Value-based window frames define rows within a window partition that contain a specified range of numeric values. The window ORDER clause of the OVER function defines the numeric, DATE, or DATETIME column to which the RANGE specification is applied, relative to the current row value for that column. Only a single sorting key is allowed in the ORDER clause of value-based window frames.

In both the row-based and value-based cases, the OLAP function is calculated on the contents of this window frame, rather than the fixed contents of the whole partition. The window frame does not need to contain the current row. For example, the following specification defines a window frame that contains only the row before the current row:

ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
If you specify no window ORDER clause for a window aggregation function, then by default, the result set is not restricted, which is equivalent to the following window frame specification:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
If you specify an ORDER clause but no window frame clause for a window aggregation function, then by default, all rows that precede the current row and the current row are returned, which is equivalent to the following window frame specification:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

Example: SUM function without a window frame

The following query returns the sales by quarter for one year and the cumulative sum of the sales by quarter.

  SELECT sales, SUM(sales) OVER (ORDER BY quarter) 
     FROM sales WHERE year = 2013
         sales     (sum)
           120       120
           135       255
           127       382
           153       535

The sum of the sales for the fourth quarter is equal to the sales in all four quarters.

Because the query does not include a window frame clause, the SUM function operates on the entire result set, as specified by the FROM clause.

Example: Row-based window frame

The following query returns players that are partitioned by teams and ordered by the number of points. Within each partition, the number of points for the player and the previous player are averaged:

SELECT team, player, points, 
   AVG(points) OVER(PARTITION BY team ORDER BY points
      ROWS 1 PRECEDING AND CURRENT ROW) AS olap_avg
FROM points;

TEAM       PLAYER     POINTS OLAP_AVG      
A          Singh           7  7.00000000000
A          Smith          14 10.50000000000
B          Osaka           8  8.00000000000
B          Ricci          12 10.00000000000
B          Baxter         18 15.00000000000
C          Chun           13 13.00000000000
D          Kwan            9  9.00000000000
D          Tran           16 12.50000000000

Example: Range-based window frame

The following query returns players that are partitioned by teams and ordered by age. Within each partition, the number of points for each player and any player who is up to 9 years older is averaged:

SELECT player, age, team, points,
   AVG(points) OVER(PARTITION BY team ORDER BY age 
      RANGE BETWEEN CURRENT ROW AND 9 FOLLOWING) AS olap_avg
FROM points_age;

PLAYER     AGE TEAM       POINTS OLAP_AVG      
Singh       25 A               7 10.50000000000
Smith       26 A              14 14.00000000000
Baxter      27 B              18 13.00000000000
Osaka       35 B               8 10.00000000000
Ricci       40 B              12 12.00000000000
Chun        21 C              13 13.00000000000
Kwan        22 D               9 12.50000000000
Tran        31 D              16 16.00000000000

In partition A, the average for Singh includes the points for Smith, because Smith is one year older than Singh. The average for Smith does not include the points from Singh, because Singh is younger than Smith.

In partition B, the average for Baxter includes the points for Osaka, who is 8 years older than Baxter, but not for Ricci, who is 13 years older than Baxter.

In partition D, the average for Kwan includes the points for Tran, because Tran is 9 years older than Kwan.

Example: Window frame without the current row

The following query calculates the average number of points for the preceding two rows in the partition:

SELECT player, age, team, points,
   AVG(points) OVER(PARTITION BY team ORDER BY age
   ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS olap_avg
   FROM points_age;
PLAYER     AGE  TEAM   POINTS   OLAP_AVG        
Singh        25 A             7             NULL
Smith        26 A            14    7.00000000000
Baxter       27 B            18             NULL
Osaka        35 B             8   18.00000000000
Ricci        40 B            12   13.00000000000
Chun         21 C            13             NULL
Kwan         22 D             9             NULL
Tran         31 D            16    9.00000000000

In partition B, the average for Ricci is based on the points totals for Baxter and Osaka: (18 + 8 = 26)/2 = 13. When the current row has no preceding rows to use for the calculation, the result is NULL.