Data distribution statistics

Data distribution statistics are stored in the system catalog for use by the query optimizer when it designs query execution plans. These statistics, together with other information, enable the optimizer to estimate the relative costs among the execution plans that the optimizer is considering for a specific query. Distribution statistics that the optimizer examines for tables that are referenced in queries can include column distribution statistics for the table and for its indexes, as well as fragment-level statistics, if the database server has gathered statistics for individual table or index fragments.

The following system catalog tables store data distribution information that is available to the query optimizer:

SYSDISTRIB
Stores data distribution information for tables and indexes.
SYSFRAGDIST
Stores fragment-level data distribution information for fragments of tables and indexes.

The following system catalog tables store information pertaining to changes to rows since the most recent update to table, index, or fragment statistics.

SYSDISTRIB
Counts the number of rows changed by DML operations since table statistics were last updated, the date and time of that update, and the time required to build column distributions.
SYSFRAGDIST
Counts the number of rows changed by DML operations since fragment-level statistics were last updated, and the date and time of that update.
SYSFRAGMENTS
Counts the number of rows changed by DML operations since fragment-level statistics were last updated.
SYSINDICES
Counts the number of rows changed by DML operations since index statistics were last updated, the date and time of that update, and time required to build low level distributions for the lead column of the index.

The following configuration parameters can affect the database server behavior for the calculation, display, or other operations on data distribution statistics for tables or for fragments that can be used in query plans:

AUTO_STAT_MODE
Enable or disable the detection (and selective refreshing) of stale statistics during UPDATE STATISTICS operations. You can override the setting of this parameter by using the onmode -wm or onmode -wf command-line utilities, or SQL administration API function calls, or (for the current session) by the SET ENVIRONMENT AUTO_STAT_MODE statement of SQL.
EXPLAIN_STAT
Enable or disable the inclusion of a Query Statistics section in the explain output file. This is enabled by default.
SYSSBSPACENAME
Specifies the name of the sbspace in which the database server stores data-distribution statistics (as smart large objects) that the UPDATE STATISTICS statement collects for certain user-defined data types. Because the data distributions for UDTs can be large, you have the option to store them in an sbspace instead of in the sysdistrib system catalog table (for table-level statistics) or in the sysfragdist system catalog table (for fragment-level statistics), where distribution statistics are stored by default.
STATCHANGE
Specifies a positive integer as a change threshold to identify table or fragment distribution statistics that need to be updated. This is the default threshold for refreshing distribution statistics on tables for which no specific threshold has been specified as a table or session attribute. If no value is specified, the default is 10. While selective refreshing of data distribution statistics enabled (by default, or by the AUITO_STAT_MODE setting, or by the AUTO keyword of the UPDATE STATISTICS statement, UPDATE STATISTICS operations only refresh stale or missing statistics. The default value of 10 restricts recalculation to only the tables or fragments in which DML, load, or TRUNCATE operations have changed more than 10% of the rows since data distribution statistics were most recently calculated.