The COMBINE function

The COMBINE aggregate support function allows your user-defined aggregate to execute in a parallel query.

When a query that contains a user-defined aggregate is processed in parallel, each parallel thread operates on a one subset of selected rows. The COMBINE function merges the partial results from two such subsets. This aggregate support function ensures that the result of aggregating over a group of rows sequentially is the same as aggregating over two subsets of the rows in parallel and then combining the results.

The COMBINE function is required for parallel execution. When a query includes a user-defined aggregate, the database server uses parallel execution when the query includes only aggregates. However, the COMBINE function might be used even when a query is not parallelized. For example, when a query contains both distinct and non-distinct aggregates, the database server can decompose the computation of the non-distinct aggregate into sub-aggregates based on the distinct column values. Therefore, you must provide a COMBINE function for every user-defined aggregate.

If you do not define an COMBINE function for your user-defined aggregate, the database server generates an error. However, if your user-defined aggregate uses a simple state, the COMBINE function can be the same as the ITER function. For more information, see Handling a simple state.

To declare a COMBINE function as a C function, use the following syntax:

agg_state combine_func(agg_state1, agg_state2)
   agg_state agg_state1, agg_state2;

agg_state: The data type of the two partial aggregate states (agg_state1 and agg_state2) as well as the updated aggregate state, which the COMBINE function returns.
agg_state1: The aggregate state from one parallel thread.
agg_state2: The aggregate state from the second parallel thread.
combine_func: The name of the COMBINE aggregate support function.

In the execution of a UDA, the database server calls the COMBINE once for each pair of threads (agg_state1 and agg_state2) that execute a parallel query that contains the user-defined aggregate. When the COMBINE function combines two partial results, it might also need to release resources associated with one of the partial results.

The following code fragment shows the COMBINE aggregate support function that handles an INTEGER argument for the SQSUM user-defined aggregate (which A sample user-defined aggregate describes).

Figure 1: The COMBINE aggregate support function for SQSUM1 on INTEGER

/* SQSUM1 COMBINE support function on INTEGER */
mi_integer combine_sqsum1(state1, state2)
   mi_integer state1, state2;
{
   /* Return the new partial sum from two parallel partial
    * sums
    */
   state1 += state2;
   return (state1);
}

For other aggregate support functions of SQSUM1, see The INIT aggregate support function for SQSUM1 on INTEGER, The ITER aggregate support function for SQSUM1 on INTEGER, and The FINAL aggregate support function for SQSUM1 on INTEGER. For more information aboutparallel execution of a UDA, see Execute a user-defined aggregate in parallel queries.