The GROUP BY clause

The GROUP BY clause divides a table into sets. This clause is most often combined with aggregate functions that produce summary values for each of those sets. Some examples in Compose SELECT statements show the use of aggregate functions applied to a whole table. This section illustrates aggregate functions applied to groups of rows.

Using the GROUP BY clause without aggregates is much like using the DISTINCT (or UNIQUE) keyword in the SELECT clause. The following query is described in Select specific columns.
Figure 1: Query
SELECT DISTINCT customer_num FROM orders;
You could also write the statement as the following query shows.
Figure 2: Query
SELECT customer_num FROM orders
   GROUP BY customer_num;
Query and Query return the following rows.
Figure 3: Query result
customer_num

         101
         104
         106
         110
        ;
         124
         126
         127

The GROUP BY clause collects the rows into sets so that each row in each set has the same customer numbers. With no other columns selected, the result is a list of the unique customer_num values.

The power of the GROUP BY clause is more apparent when you use it with aggregate functions.

The following query retrieves the number of items and the total price of all items for each order.
Figure 4: Query
SELECT order_num, COUNT (*) number, SUM (total_price) price
   FROM items
   GROUP BY order_num;

The GROUP BY clause causes the rows of the items table to be collected into groups, each group composed of rows that have identical order_num values (that is, the items of each order are grouped together). After the database server forms the groups, the aggregate functions COUNT and SUM are applied within each group.

Query returns one row for each group. It uses labels to give names to the results of the COUNT and SUM expressions, as the result shows.
Figure 5: Query result
 order_num      number         price 

     1001           1          $250.00
     1002           2         $1200.00
     1003           3          $959.00
     1004           4         $1416.00
     ;
     1021           4         $1614.00
     1022           3          $232.00
     1023           6          $824.00

The result collects the rows of the items table into groups that have identical order numbers and computes the COUNT of rows in each group and the SUM of the prices.

You cannot include a TEXT, BYTE, CLOB, or BLOB column in a GROUP BY clause. To group, you must be able to sort, and no natural sort order exists for these data types.

Unlike the ORDER BY clause, the GROUP BY clause does not order data. Include an ORDER BY clause after your GROUP BY clause if you want to sort data in a particular order or sort on an aggregate in the projection list.

The following query is the same as Query but includes an ORDER BY clause to sort the retrieved rows in ascending order of price, as the result shows.
Figure 6: Query
SELECT order_num, COUNT(*) number, SUM (total_price) price
   FROM items
   GROUP BY order_num
   ORDER BY price;
Figure 7: Query result
  order_num      number         price

     1010           2           $84.00
     1011           1           $99.00
     1013           4          $143.80
     1022           3          $232.00
     1001           1          $250.00
     1020           2          $438.00
     1006           5          $448.00
     ;
     1002           2         $1200.00
     1004           4         $1416.00
     1014           2         $1440.00
     1019           1         $1499.97
     1021           4         $1614.00
     1007           5         $1696.00

The topic Select specific columns describes how to use an integer in an ORDER BY clause to indicate the position of a column in the projection list. You can also use an integer in a GROUP BY clause to indicate the position of column names or display labels in the GROUP BY list.

The following query returns the same rows as Query shows.
Figure 8: Query
SELECT order_num, COUNT(*) number, SUM (total_price) price
   FROM items
   GROUP BY 1
   ORDER BY 3;

When you build a query, all non-aggregate columns that are in the projection list in the Projection clause must also be included in the GROUP BY clause. A SELECT statement with a GROUP BY clause must return only one row per group. Columns that are listed after GROUP BY are certain to reflect only one distinct value within a group, and that value can be returned. However, a column not listed after GROUP BY might contain different values in the rows that are contained in the group.

The following query shows how to use the GROUP BY clause in a SELECT statement that joins tables.
Figure 9: Query
SELECT o.order_num, SUM (i.total_price)
   FROM orders o, items i
   WHERE o.order_date > '01/01/98'
      AND o.customer_num = 110
      AND o.order_num = i.order_num
   GROUP BY o.order_num;
The query joins the orders and items tables, assigns table aliases to them, and returns the rows.
Figure 10: Query result
  order_num            (sum)

       1008          $940.00
       1015          $450.00