thesaurus index parameters

You can create a thesaurus so that basic text searches return synonyms as well as exact matches of specified words. A thesaurus is useful if your text data has multiple words for the same information.

thesaurus index parameters

{ thesaurus=" yes" | thesaurus_index ="thesaurus_index" }
Element Description
thesaurus_index The name of the bts index that is created on the thesaurus table.

Usage

People's names is an example of the type of data that can benefit from a thesaurus. Because people can have nicknames, multiple names for the same person might exist in the database. If you define a thesaurus for common nicknames, your basic text queries can return more accurate results. Synonyms are not used in a query if the query includes the following term modifiers: wildcard, fuzzy, proximity, or range query.

When you include a thesaurus in your bts index definition, basic text queries include all synonyms for specific search terms. For example, if you define mark, marc, marcus, and marco as synonyms, when you query for any one of these names the query is rewritten to include all of them:
'(mark OR marc OR marcus OR marco)'

To create a thesaurus:

  1. Create the thesaurus table with a text column for the synonym data. You can use any of the data types that are supported by the bts index.
  2. Add the synonym data to the thesaurus table. Each value for the synonym data column is a list of words that you want to be treated as synonyms. You can create synonyms for only single words. You cannot create synonyms for phrases.
  3. Create a bts index on the thesaurus table. Include the thesaurus="yes" parameter.

When you create the bts index on the table that contains the text data, follow these rules:

  • Specify the synonym data column as the column to index.
  • Include the thesaurus_index="thesaurus_index" parameter, specifying the thesaurus index that you created.
  • Set the query_default_operator index parameter to "OR" or omit the parameter.

You can dynamically update your thesaurus without rebuilding the basic text search index by updating the thesaurus table.

Example

Suppose that you create a table called mytbl with the following statements:

CREATE TABLE mytbl(name char(30));
INSERT INTO mytbl(name) VALUES('mark');
INSERT INTO mytbl(name) VALUES('elizabeth');
INSERT INTO mytbl(name) VALUES('marco');
INSERT INTO mytbl(name) VALUES('beth');

You create a thesaurus table named mythesaurus and add synonym data to it:

CREATE TABLE mythesaurus(synonyms lvarchar);
INSERT INTO mythesaurus(synonyms) 
      VALUES('elizabeth liz beth eliza leisal betty liza');
INSERT INTO mythesaurus(synonyms) 
      VALUES('mark marc marcus marco');

You create a bts index on the thesaurus table:

CREATE INDEX mythesaurus_index
ON mythesaurus(synonyms bts_lvarchar_ops)
USING bts(thesaurus="yes");

You create a bts index that uses the thesaurus on the table mytbl:

CREATE INDEX name_index
ON mytbl(name bts_char_ops)
USING bts(thesaurus_index="mythesaurus_index");

Now when you search for the name elizabeth, the query returns both the exact match and the synonym beth:

SELECT * FROM mytbl WHERE bts_contains(name, 'elizabeth');

name
elizabeth
beth
2 row(s) retrieved.

When you search for both marcus or liza, the query returns four synonyms but no exact matches:

SELECT * FROM mytbl WHERE bts_contains(name, 'marcus or liza');

name
mark
marco
elizabeth
beth
4 row(s) retrieved.