Aggregate expressions in grid queries

Aggregate expressions in grid queries require the database server that issues the grid query to combine aggregate results from multiple grid servers.

Grid queries are distributed SELECT statements that return the logical UNION or UNION ALL on qualifying rows of database tables that have identical schemas in databases of a grid. Grid queries are described in the GRID clause topic and in the HCL OneDB™ Enterprise Replication Guide.

Aggregate expressions in grid queries require the database server that issues the Grid query to combine aggregate results from multiple grid servers. Because the Grid query that you specify is converted into a UNION or UNION ALL query across multiple grid servers, each aggregate is calculated independently on each participating server within the specified grid or region. These individual aggregates are returned to the database server that issued the grid query, which calculates their combined UNION or UNION ALL value.

In order to calculate a global aggregation from these individual aggregates, the grid query must be a subquery. For example, suppose we wanted to see for the southeast region (SW_USA in the following example) the total sales and the average sale. Because it is not correct to take the average of a group of averages if the number of qualifying rows varies across data sets within the grid, the correct grid query needs to resemble this:

SELECT SUM(amt) AS total_sales , 
       SUM(amt) / SUM(cnt) AS avg_sale FROM
	(
	SELECT COUNT(*) cnt, SUM(amt) amt
	FROM sales GRID ALL 'SW_USA'
	);

     total_sales         avg_sale
 
        $8300.00          $103.75

In this example, the projection list of the grid query specifies a COUNT(*) expression (aliased as cnt) and an aggregate amount SUM(amt) (aliased as amt) from the sales table of each grid server within the southeast region (encoded as 'SW_USA' ). Using these values from each of the remote grid servers, the local grid server can calculate an average for the region. This is accomplished by adding up the total sales by evaluating the expression SUM(amt) and dividing that value by the total count SUM(cnt), where cnt and amt are columns in the result set of the grid subquery.

Note that the GRID ALL keywords specify a UNION ALL of the qualifying rows from each participating grid server, to avoid eliminating duplicate rows.