Estimating conventional index pages

You can estimate the size of index pages, using a series of formulas.

About this task

Procedure

To estimate the number of index pages:

  1. Add up the total widths of the indexed column or columns.

    This value is referred to as colsize. Add 4 to colsize to obtain keysize, the actual size of a key in the index. For example, if colsize is 6, the value of keysize is 10.

  2. Calculate the expected proportion of unique entries to the total number of rows.

    The formulas in subsequent steps see this value as propunique.

    If the index is unique or has few duplicate values, use 1 for propunique.

    If a significant proportion of entries are duplicates, divide the number of unique index entries by the number of rows in the table to obtain a fractional value for propunique. For example, if the number of rows in the table is 4,000,000 and the number of unique index entries is 1,000,000, the value of propunique is .25.

    If the resulting value for propunique is less than .01, use .01 in the calculations that follow.

  3. Estimate the size of a typical index entry with one of the following formulas, depending on whether the table is fragmented or not:
    1. For nonfragmented tables, use the following formula:
      entrysize = (keysize * propunique) + 5 + 4

      The value 5 represents the number of bytes for the row pointer in a nonfragmented table.

      For nonunique indexes, the database server stores the row pointer for each row in the index node but stores the key value only once. The entrysize value represents the average length of each index entry, even though some entries consist of only the row pointer.

      For example, if propunique is .25, the average number of rows for each unique key value is 4. If keysize is 10, the value of entrysize is 11.5, calculated as (10 * 0.25) + 5 + 4 = 2.5 + 9 = 11.5. The following calculation shows the space required for all four rows:

      space for four rows = 4 * 11.5 = 46

      This space requirement is the same when you calculate it for the key value and add the four row pointers, as the following formula shows:

      space for four rows = 10 + (4 * 9) = 46
    2. For fragmented tables, use the following formula:
      entrysize = (keysize * propunique) + 9 + 4

      The value 9 represents the number of bytes for the row pointer in a fragmented table.

  4. Estimate the number of entries per index page with the following formula:
    pagents = trunc(pagefree/entrysize)

    In this formula:

    • pagefree is the page size minus the page header (2020 for a 2-kilobyte page size).
    • entrysize is the size of a typical index entry, which you estimated in the previous step.

    The trunc() function notation indicates that you should round down to the nearest integer value.

  5. Estimate the number of leaf pages with the following formula:
    leaves = ceiling(rows/pagents) 

    In this formula:

    • rows is the number of rows that you expect to be in the table.
    • pagents is the number of entries per index page, which you estimated in the previous step.

    The ceiling() function notation indicates that you should round up to the nearest integer value.

  6. Estimate the number of branch pages at the second level of the index with the following formula:
    branches0 = ceiling(leaves/node_ents)

    Calculate the value for node_ents with the following formula:

    node_ents = trunc( pagefree / ( keysize + 4) + 4)

    In this formula:

    • pagefree is the page size minus the page header (2020 for a 2-kilobyte page size).
    • keysize is the colsize plus 4. You obtained this value in step 1.

    In the formula, 4 represents the number of bytes for the leaf node pointer.

  7. If the value of branches0 is greater than 1, more levels remain in the index.

    To calculate the number of pages contained in the next level of the index, use the following formula:

    branchesn+1 = ceiling(branchesn/node_ents)

    In this formula:

    • branchesn is the number of branches for the last index level that you calculated.
    • branchesn+1 is the number of branches in the next level.
    • node_ents is the value that you calculated in step 6.
  8. Repeat the calculation in step 7 for each level of the index until the value of branchesn+1 equals 1.
  9. Add up the total number of pages for all branch levels calculated in steps 6 through 8. This sum is called branchtotal.
  10. Use the following formula to calculate the number of pages in the compact index:
    compactpages = (leaves + branchtotal)
  11. If your database server instance uses a fill factor for indexes, the size of the index increases.

    The default fill factor value is 90 percent. You can change the fill factor value for all indexes with the FILLFACTOR configuration parameter. You can also change the fill factor for an individual index with the FILLFACTOR clause of the CREATE INDEX statement in SQL.

    To incorporate the fill factor into your estimate for index pages, use the following formula:
    indexpages = 100 * compactpages / FILLFACTOR

Results

The preceding estimate is a guideline only. As rows are deleted and new ones are inserted, the number of index entries can vary within a page. This method for estimating index pages yields a conservative (high) estimate for most indexes. For a more precise value, build a large test index with real data and check its size with the oncheck utility.

Tip: A forest of trees index can be larger than a B-Tree index. When you estimate the size of a forest of trees index, the estimates apply to each subtree in the index. Then, you must aggregate the buckets to calculate the total estimation.