Fragment indexes

Fragmenting indexes typically improves performance of text searches, particularly on multiprocessor computers. However, improperly fragmented indexes can degrade performance of certain queries. Carefully consider how your likely data and your likely queries might be affected by fragmentation.

As with all databases and indexing methods, spreading the text data and the etx index over multiple disk drives can improve performance. If enough disk drives are available, it is also beneficial to spread the table data into multiple dbspaces or sbspaces by using either expression-based or round-robin fragmentation.

Expression-based fragmentation is the best mechanism for achieving smaller index fragments because each index fragment functions as an isolated index and incurs open costs proportional to the size of just that fragment. Do not use round-robin fragmentation for the index, because it can result in worse performance than no fragmentation at all.

Text searches are performed with the etx_contains() operator, which is executed as an index scan. The key to lowering index scan costs is either to reduce the number of rows selected through the index scan by using selective search criteria or to shrink the size of the index fragment being read.

As an example, consider a simple table with the following definition:
CREATE TABLE recipes
(
    id         INTEGER,
    recipe      CLOB,
    meal_type      CHAR(1),
    ingredients      LVARCHAR
);
Suppose that most of the time the users are interested in particular recipes for meals at a certain time of the day: they do not want to eat pancakes for dinner or lasagna for breakfast. Query performance is improved if the index data is fragmented according to the meal_type column, as shown in the following example:
CREATE INDEX recipes_idx ON recipes (recipe etx_clob_ops)
    USING etx (PHRASE_SUPPORT = 'MAXIMUM')
    FRAGMENT BY EXPRESSION
    meal_type = 'B' in sbsp1,
    meal_type = 'L' in sbsp2,
    meal_type = 'D' in sbsp3;
To find dinner menus that use zucchini, the SQL statement would look like this:
SELECT id FROM recipes
    WHERE etx_contains ( recipe, 'zucchini')
    AND meal_type = 'D';

This immediately eliminates fragments from consideration and results in faster query execution.

Consider the impact of fragmentation on query performance and concurrency when users update data as well. You want to lock the index for as short a time as possible. HCL OneDB™ exclusively write-locks a smart large object when it is being written to. Since text indexes are stored in smart large objects, each index fragment is locked exclusively while it is being updated.

Fragment your index into no more than six fragments. Experiment with tuning for your specific DBMS to determine the optimal number of index fragments for your data.

Excessive index fragmentation can degrade performance in certain situations. For example, if your index is fragmented into many fragments and a query retrieves information from every one, the overall performance of the query might be worse than if the index had not been fragmented at all.