Handling filter errors

The module writes error messages to the trace file described in the previous section. The message includes the row ID of the row that caused the error. You can use the row ID to select the offending row from the table, fix the problem with the document, and update or reinsert the row in the table with the fixed document.

Important: You can use row IDs to find rows only in a nonfragmented table. The current implementation of row IDs in fragmented tables makes it impossible for you to map the row ID returned by the DataBlade® module to an actual row in the table. Therefore, this discussion is relevant only to nonfragmented tables.

If you specify STOP_ON_ERROR for the FILTER index parameter, you cannot finish executing the statement until all filter problems have been resolved. This means that if you execute the CREATE INDEX statement to create an index on a table that currently contains rows, the index is created only when there are no filter problems with any existing documents. Similarly, if you execute the INSERT statement to insert new rows into a table that already has an etx index, the INSERT statement completes only when all the new documents have no filter problems.

If you specify CONTINUE_ON_ERROR, filtering errors do not stop the execution of the CREATE INDEX, INSERT, or UPDATE statement. If an error is encountered while filtering a document, the unfiltered document, along with any formatting information, is added to the index instead. When the statement is finished, a list of all the rows whose documents could not be filtered is written to the trace file. You can use the list of row IDs first to fix the problems with the documents and then to update the rows. As the rows in the table are updated, the corresponding entry in the index is also updated. If the original filtering problem was fixed, the new index entry contains the filtered document.

The following example shows one of many ways to correct a filtering problem after an index has been created.

Suppose you create an etx index on the abstract column of the table called reports. You also specify FILTER=CONTINUE_ON_ERROR so that the documents in the column are filtered. The abstract column is of data type BLOB and the table contains 100 rows.

Although the CREATE INDEX statement successfully finishes executing, the trace log indicates that row 65 had a filtering problem. This means that the document in the abstract column of row 65 was not filtered before it was added to the index; it was added with all its formatting information.

After you create the index, you decide to try to resolve the filtering problem in the document in row 65. You can select this row from the table, and simultaneously write the contents of the abstract column to a file on the operating system on the client computer, by executing the following SELECT statement:

SELECT LoToFile (abstract, '/tmp/outfile', 'client')
FROM my_table
WHERE rowid = 65;

You can then examine the contents of the file /tmp/outfile to find out what caused the filter error. After you have fixed this error, you can update the table, as shown in the next example. For this example, the correctly formatted document is now contained in the file /tmp/infile:

UPDATE my_table
SET abstract = FileToCLOB ('/tmp/infile' , 'client')
WHERE rowid = 65;

When you update the abstract column in the table, you are also updating the etx index built on this column. The index on the abstract column for the preceding example reflects changes to the data in row 65.