HASH ON clause

Use the HASH ON clause of the CREATE INDEX statement to specify the columns and number of subtrees (buckets) for a forest of trees index.

HASH ON clause

1  HASH ON ( + , column ) WITH number BUCKETS
Element Description Restrictions Syntax
column The name of the column or columns on which you use the HASH ON clause to create a forest of trees index The list must be a prefix list of the index columns used in the CREATE INDEX statement Identifier
number The number of subtrees (buckets) to create for a forest of trees index The number of buckets for a forest of trees index must range from 2 to the number of available index pages per dbspace Integer Literals

Usage

Forest of trees indexes are detached indexes. They cannot be attached indexes.

You can create forest of trees indexes on columns with base data types.

You cannot:

  • Create forest of trees indexes on columns with complex data types, UDTs, or functional columns.
  • Use the FILLFACTOR option of the CREATE INDEX statement when you create forest of trees indexes, because the indexes are built from top to bottom.
  • Create clustered forest of trees indexes.
  • Run the ALTER INDEX statement on forest of trees indexes.
  • Use forest of trees indexes in queries that use aggregates, including minimum and maximum range values
  • Perform range scans directly on the HASH ON columns of a forest of trees index.

    However, you can perform range scans on columns that are not listed in the HASH ON column list. For range scans on columns listed in HASH ON column list, you must create an additional B-tree index that contains the appropriate column list for the range scan. This additional B-tree index might have the same column list as the forest of trees index, plus or minus a column.

  • Use a forest of trees index for an OR index path. The database server does not use forest of trees indexes for queries that have an OR predicate on the indexed columns.

When you create a forest of trees index, choose enough columns to create unique values.

Tip: Generally, the columns to choose depend on the number of duplicates for each column. For example, if the first column contains a small number of duplicates, the first two columns are sufficient for hashing if they do not contain a large number of duplicates. If the first two columns contain a majority of duplicates, then you need to also choose a third column.

The number of subtrees depends on your goal for the index. If your goal is:

  • To reduce contention, initially create a forest of trees index with 2 subtrees per CPU VP. You might need more subtrees, depending on the number of rows in the table and how many duplicates exist.
  • To reduce the number of levels in the B-tree:
    1. Run the oncheck -pT command.
    2. In the output, find the number of nodes at each level.
    3. Determine how many subtrees are required to achieve the desired depth for each tree in the index.

For example, suppose an index averages 100 keys per page, the index has 1M keys, and the tree looks like this:

  • Level 1 (root) 100 keys
  • Level 2 10K keys
  • Level 3 1M keys

To reduce the 3-level tree to 100 2-level trees, the index needs roughly 100 subtrees. To reduce the 3-level tree to 10K 1-level trees, the index needs roughly 10K subtrees.

Forest of tree pages can be sparser than traditional B-tree pages if too many or too few subtrees are used. When the pages are sparser, more pages occupy the buffer pool, and therefore, cause other tables to become less cached.

Examples

The following command creates a forest of trees index named idx1 with 100 subtrees on column c1:

CREATE INDEX idx1 ON tab1(c1) HASH ON (c1) with 100 buckets;

The following command creates a forest of trees index named idx2. In the command, the prefix list for the HASH ON portion of the statement is c1 and c2, which is a prefix list of the c1, c2, and c3 columns used in the CREATE INDEX portion of the statement.

CREATE INDEX idx2 on tab2(c1, c2, c3) HASH ON (c1, c2) with 10 buckets;

The following command creates a forest of trees index for equality lookups on columns c1 and c2:

CREATE INDEX idx3 on tab3(c1, c2) HASH ON (c1, c2) with 100 buckets;

The following command creates a B-tree index that is similar to the previous forest of trees index. This index is for range scans on columns c1 and c2:

CREATE INDEX idx4 on tab4(c1, c2, c3);