Selectivity and cost functions

For the optimizer to accurately assess the cost of using an R-Tree index, your DataBlade® module must provide selectivity and per-row cost functions. If these functions are not present, or only one of the functions is present, the cost of using an R-Tree index defaults to 50, except when the nearest neighbor strategy function is used. When the nearest neighbor strategy function is used, the server always uses the R-tree index.

Selectivity is defined as the number of rows in the result set divided by the total number of rows in the table queried (and must be between 0.0 and 1.0):

The per-row cost function calculates the cost of evaluating the predicate of the query for each row (and must be greater than or equal to 0).

For information about how to write selectivity and cost functions, see the Informix® DataBlade API Programmer's Guide, which describes how to create selectivity and cost functions for an expensive UDR. For a general description of how the query optimizer uses cost and selectivity for UDRs, see Informix User-Defined Routines and Data Types Developer's Guide.

The paper, Accurate Estimation of the Cost of Spatial Selections by A. Aboulnaga and J. F. Naughton, might also provide useful information. It is available in the proceedings of the IEEE International Conference on Data Engineering, San Diego, California, 2000.

The cost of using the R-tree index is calculated when you run UPDATE STATISTICS. See Update statistics for more information about how statistics are gathered.

You register the selectivity and per-row cost functions when you register the strategy functions for the R-tree index. For example:
-- The selectivity function for the strategy function equal
CREATE FUNCTION GeoObjectEqualSelectivity(pointer, pointer)
RETURNS float
WITH (not variant, parallelizable) EXTERNAL NAME 
'$INFORMIXDIR/extend/GEO/geodetic.bld(GeoObjectEqualSelectivity)' LANGUAGE
c;


-- The per-row cost function for the strategy function equal
CREATE FUNCTION GeoObjectEqualCost(pointer, pointer)
RETURNS int
WITH (not variant, parallelizable) EXTERNAL NAME 
'$INFORMIXDIR/extend/GEO/geodetic.bld(GeoObjectEqualCost)' LANGUAGE c;


--Register the selectivity and per-row cost functions as 
--you register the strategy function equal
CREATE FUNCTION equal(GeoObject, GeoObject) RETURNS Boolean
WITH (not variant, parallelizable,
selfunc=GeoObjectEqualSelectivity,
costfunc=GeoObjectEqualCost)
EXTERNAL NAME 
'$INFORMIXDIR/extend/GEO/geodetic.bld(GeoObjectEqual)' LANGUAGE c;
It is recommended that you specify the selectivity and per-row cost functions with each strategy function that you register. If you have already registered a strategy function and you want to add the selectivity and per-row cost functions, use the ALTER FUNCTION statement as shown in the following example:
ALTER FUNCTION Contains(GeoObject, GeoObject) WITH
(ADD selfunc= GeoObjectContainsSelectivity);

ALTER FUNCTION Contains(GeoObject, GeoObject) WITH
(ADD costfunc= GeoObjectContainsCost);
Restriction: Do not set the selectivity or per-row cost at a constant value; this will cause the cost of using an R-tree index to be set at 50. (If required, you can set your selectivity and per-row cost functions to return a constant value.)