Collect statistics data

The UPDATE STATISTICS statement collects statistics about the tables in your database. It automatically collects statistics for all columns with built-in data types (except TEXT and BYTE). However, it cannot automatically collect statistics for columns with user-defined data types because it does not know the structure of these data types.

For UPDATE STATISTICS to collect statistics for a column with a user-defined data type, you must write a user-defined function named statcollect() that collects statistics data for your user-defined data type. The UPDATE STATISTICS statement takes the following steps for columns of user-defined data types:
  • Calls the statcollect() function that handles the user-defined data type

    This statcollect() function gathers the statistics data for the column and stores it as the stat opaque data type.

  • Stores this stat data type in the sysdistrib system catalog table, where the statistics data can be accessed by the query optimizer
    UPDATE STATISTICS stores the following information in the row of the sysdistrib table that corresponds to the user-defined-type column:
    • In the encdat column of the sysdistrib row: the stat data type that statcollect() returns
    • In the type column of the sysdistrib row: an 'S' to indicate that the encdat column contains user-defined statistics
To have the UPDATE STATISTICS statement collect statistics for your user-defined data type, you must:
  • Design the statistics information that is appropriate for your user-defined data type.
  • Define a C statistics-collection function to implement the statistics collection.
  • Collect the statistics for the column within this statistics-collection function.
  • Register this C function as a statcollect() user-defined function.

If a statcollect() function does not exist for your user-defined data type, UPDATE STATISTICS does not collect statistics data for any column of that type.