Built-in character sets

With the ASCII setting, only the alphanumeric characters 0 through 9, A through Z, and a through z are indexed. If you do not specify the CHAR_SET index parameter when you create an etx index, the ASCII character set is used by default.

When using the ASCII setting, words that contain international characters might actually be indexed as multiple words. For example, the word cañon, if indexed with the ASCII character set, is indexed as two words, ca and on, because the non-indexed character ñ is treated as white space.
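
The splitting behavior described above can be imitated outside the database. The following Python sketch is only an illustration, not the indexing code of the product: the function name ascii_index_words is hypothetical, and the sketch assumes that any character outside 0 through 9, A through Z, and a through z acts as a word separator.

```python
import re

# Assumption: every character that the ASCII setting does not index
# behaves like white space, so a run of such characters separates words.
NON_INDEXED = re.compile(r"[^0-9A-Za-z]+")


def ascii_index_words(text):
    """Return the words that would be indexed under the ASCII setting.

    Hypothetical helper for illustration only.
    """
    # Drop the empty strings that re.split produces at the edges.
    return [word for word in NON_INDEXED.split(text) if word]


print(ascii_index_words("cañon"))   # ['ca', 'on']
print(ascii_index_words("canyon"))  # ['canyon']
```

A search for the whole word therefore does not find the row under this setting, because the index contains only the two fragments.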

With the ISO setting, both alphanumeric characters and ISO Latin-1 alphabetic characters are indexed. Use this 68-character set to index data that might contain international characters from the ISO Latin-1 set.

The OVERLAP_ISO setting means that the same character set as the ISO setting is indexed, but similar-looking characters are grouped together so that a word matches as long as corresponding letters are from the same group.
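
The group matching described above can be sketched in Python. The actual groups used by OVERLAP_ISO are not listed here, so this sketch substitutes an assumed grouping: a letter belongs to the group of its base letter once any accent is removed, and uppercase and lowercase letters stay in separate groups. The names group_key and overlap_match are hypothetical.

```python
import unicodedata


def group_key(ch):
    """Return an assumed group label for one character.

    Assumption: the group is the base letter left after the character is
    decomposed and its combining accents are dropped (ñ becomes n).
    Characters that do not decompose form a group of their own.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def overlap_match(word_a, word_b):
    """Return True when corresponding letters fall in the same group."""
    # Use precomposed characters so each accented letter is one position.
    word_a = unicodedata.normalize("NFC", word_a)
    word_b = unicodedata.normalize("NFC", word_b)
    if len(word_a) != len(word_b):
        return False
    return all(group_key(a) == group_key(b) for a, b in zip(word_a, word_b))


print(overlap_match("cañon", "canon"))  # True
print(overlap_match("cañon", "cabon"))  # False
```

Under this assumed grouping, a search for canon also matches cañon, because ñ and n share a group while every other letter is identical.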