Configuring the attachment types that can be full-text indexed

You can control which attachment file types are full-text indexed.

About this task

By default, all file formats that Tika 1.24.1 supports are full-text indexed, except for the following:
.au, .bqy, .cca, .dbd, .dll, .exe, .gif, .gz, .img, .jar, .jpg, .mov, .mp3, .mpg,
.msi, .nsf, .ntf, .p7m, .p7s, .pag, .pdb, .png, .rar, .sys, .tar, .tif, .wav, .wpl, .z, .zip
Use notes.ini settings to define your own list of attachment types that are allowed to be full-text indexed.
Tip:
  • To deploy the notes.ini settings on multiple servers, use a server Configuration document in the directory. In the Basics tab, for Use these settings as the default settings for all servers, select Yes, and then add the setting or settings to the NOTES.INI tab of the document.
  • To deploy the notes.ini settings on multiple Notes clients, use the Custom Settings > notes.ini tab of a Desktop Settings policy document.

Procedure

  1. To define your own list of attachment types to allow for full-text indexing, add the following notes.ini setting to a Domino server or Notes client:
    FT_USE_MY_ATTACHMENT_WHITE_LIST=1
    Note that this setting overrules the default behavior and no attachments can be full-text indexed until you complete the next step.
  2. Use the notes.ini settings in the following table to configure which attachment types to full-text index.
    Table 1. Configuring the attachment types to allow to be full-text searched
    Goal Settings to use
    Configure which file types to allow on all databases.

    FT_INDEX_FILTER_ATTACHMENT_TYPES=*.<format>,*.<format> where <format> is a file format. Use a comma between formats.

    FT_INDEX_FILTER_ATTACHMENT_TYPES_MAX_MB=<value> where <value> is an optional maximum attachment size in MB to limit the size of files that can be searched. There is no limit if not specified.

    For example:
    FT_INDEX_FILTER_ATTACHMENT_TYPES=*.mp3
    FT_INDEX_FILTER_ATTACHMENT_TYPES_MAX_MB=2
    
    Configure which file types to allow on a specific database.

    FT_INDEX_FILTER_ATTACHMENT_TYPES_<replicaID>=*.<format>,*.<format> where <replicaID> is the replica ID of the database to search and <format> is a file format. Use a comma between formats.

    FT_INDEX_FILTER_ATTACHMENT_TYPES_<replicaID>_MAX_MB=<value> where <replicaID> is the replica ID of the database to search and <value> is an optional maximum attachment size in MB to limit the size of files that can be searched. There is no limit if not specified.

    For example:
    FT_INDEX_FILTER_ATTACHMENT_TYPES_00124866492581EC=*.txt
    FT_INDEX_FILTER_ATTACHMENT_TYPES_00124866492581EC_MAX_MB=1
    Note: If you currently use the notes.ini setting FT_USE_ATTACHMENT_WHITE_LIST=1 on a Domino server or Notes client, note the following behavior:
    • The following file types are allowed for full-text indexing by default. Note that this whitelist overrules the blacklisted formats .zip and .jar and allows them to be indexed. The other types in the blacklist are still disallowed.
      *.123,*.ami,*.ap,*.as,*.aw,*.dca,*.doc*,*.dwg,*.emf,*.emz,*.fff,*.fft,*.flg,*.fm,
      *.htm*,*.hwp,*.jar,*.jtd,*.jtt,*.lwp,*.mime,*.oas,*.odp,*.ods,*.odt,*.pdf*,*.pic,
      *.ppt*,*.pst,*.qpw,*.r13,*.r14,*.rtf,*.sam,*.shw,*.swp,*.vsd*,*.wk4,*.wks,*.wmf,*.wp*,
      *.wri,*.xlr,*.xls*,*.xml,*.xy*,*.zip
    • You can use the notes.ini settings described in Step 2 along with this setting to allow additional attachment types.
    • Switching to FT_USE_MY_ATTACHMENT_WHITE_LIST=1 overrules this setting.
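
To illustrate how the settings in steps 1 and 2 combine, the following notes.ini fragment is a minimal sketch rather than a recommended configuration. It enables a custom list, allows PDF, Word, and Excel attachments of up to 10 MB on all databases, and allows text files of up to 1 MB on the database with replica ID 00124866492581EC. The file types and the 10 MB limit are assumed values chosen for illustration only. The replica ID is the one used in the example in Table 1.

    FT_USE_MY_ATTACHMENT_WHITE_LIST=1
    FT_INDEX_FILTER_ATTACHMENT_TYPES=*.pdf,*.docx,*.xlsx
    FT_INDEX_FILTER_ATTACHMENT_TYPES_MAX_MB=10
    FT_INDEX_FILTER_ATTACHMENT_TYPES_00124866492581EC=*.txt
    FT_INDEX_FILTER_ATTACHMENT_TYPES_00124866492581EC_MAX_MB=1

This topic does not describe how a database-specific list interacts with the list that applies to all databases, so confirm the resulting behavior on a test server before you deploy the settings widely.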