Character sets

A character set defines which characters are indexed when an etx index is built on a column that contains text documents. Any character in a text document that is not in the specified character set is treated as white space in the index. The text document itself is not changed.
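The effect of this rule can be illustrated with a minimal Python sketch. It is not the module's implementation. The names DEMO_CHAR_SET, indexable_view, and indexed_words are hypothetical, and the demo set (ASCII letters and digits only) is an assumption chosen for illustration, not the definition of any built-in character set.

```python
import string

# Hypothetical character set, for illustration only. The module's real
# built-in sets (ASCII, ISO, OVERLAP_ISO) are defined by the product.
DEMO_CHAR_SET = frozenset(string.ascii_letters + string.digits)


def indexable_view(text, char_set=DEMO_CHAR_SET):
    """Return a copy of text in which every character outside
    char_set is replaced by a space. The input is not modified."""
    return "".join(ch if ch in char_set else " " for ch in text)


def indexed_words(text, char_set=DEMO_CHAR_SET):
    """Split the white-space-substituted view into the words that
    an index built with char_set would see in this sketch."""
    return indexable_view(text, char_set).split()


document = "naïve re-entry, v2.0"
words = indexed_words(document)
print(words)     # ['na', 've', 're', 'entry', 'v2', '0']
print(document)  # the stored text is unchanged: naïve re-entry, v2.0
```

In this sketch the characters ï, the hyphen, the comma, and the period fall outside the demo set, so each acts as a word separator in the indexed view, while the original string is left as it was.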

To specify a character set, set the CHAR_SET index parameter to the name of the character set when you create the etx index with the CREATE INDEX statement.

The module provides three built-in character sets: ASCII, ISO, and OVERLAP_ISO. If none of these is adequate for your text documents, you can define your own character set.

Important: Character sets differ in the number of characters they support in an index. The more characters a character set supports, the fewer characters of a clue word are searched; the fewer characters it supports, the more characters of a clue word are searched.

The following sections describe the built-in character sets and explain when and how to create your own.