Choosing a distribution scheme

When choosing a distribution scheme, you must consider the ease of data balancing, whether you want fragments to be eliminated, and the effect of the data skip feature.

Distribution-Scheme Comparisons compares round-robin and expression-based distribution schemes.

Table 1. Distribution-Scheme Comparisons
Distribution Scheme Ease of Data Balancing Fragment Elimination Data Skip
Round-robin Automatic. Data is balanced over time. The database server cannot eliminate fragments. You cannot determine if the integrity of the transaction is compromised when you use the data-skip feature. However, you can insert into a table fragmented by round-robin.
Expression-based Requires knowledge of the data distribution. If expressions on one or two columns are used, the database server can eliminate fragments for queries that have either range or equality expressions. You can determine whether the integrity of a transaction has been compromised when you use the data-skip feature. You cannot insert rows if the appropriate fragment for those rows is down.
The distribution scheme that you choose depends on the following factors:
  • The features in Distribution-Scheme Comparisons of which you want to take advantage
  • Whether or not your queries tend to scan the entire table
  • Whether or not you know the distribution of data to be added
  • Whether or not your applications tend to delete many rows
  • Whether or not you cycle your data through the table

Basically, the round-robin scheme provides the easiest and surest way of balancing data. However, with round-robin distribution, you have no information about the fragment in which a row is located, and the database server cannot eliminate fragments.

In general, round-robin is the correct choice only when all the following conditions apply:
  • Your queries tend to scan the entire table.
  • You do not know the distribution of data to be added.
  • Your applications tend not to delete many rows. (If they do, load balancing can be degraded.)
An expression-based scheme might be the best choice to fragment the data if any of the following conditions apply:
  • Your application calls for numerous decision-support queries that scan specific portions of the table.
  • You know what the data distribution is.
  • You plan to cycle data through a database.

If you plan to add and delete large amounts of data periodically, based on the value of a column such as date, you can use that column in the distribution scheme. You can then use the alter fragment attach and alter fragment detach statements to cycle the data through the table.

The ALTER FRAGMENT ATTACH and DETACH statements provide the following advantages over bulk loads and deletes:
  • The rest of the table fragments are available for other users to access. Only the fragment that you attach or detach is not available to other users.
  • With the performance enhancements, the execution of an ALTER FRAGMENT ATTACH or DETACH statement is much faster than a bulk load or mass delete.

For more information, see Improve the performance of operations that attach and detach fragments.

In some cases, an appropriate index scheme can circumvent the performance problems of a particular distribution scheme. For more information, see Strategy for fragmenting indexes.