Example: Customizing the search index schema

You can customize the search index schema. In general, schema changes affect both the WebSphere Commerce runtime and the index-building utilities.

Note: The deployed search server configuration files, including the schema.xml, solrconfig.xml, and wc-data-config.xml files, are not migrated automatically. You must merge these files manually.

For more information, see Migrating WebSphere Commerce search.

Procedure

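Open the search index schema file, schema.xml, in the conf directory for your environment:

WebSphere Commerce Developer: WCDE_installdir\search\solr\home\masterCatalogId\en_US\Catalogentry\conf
WebSphere Commerce server: WC_installdir/instances/instance_name/search/solr/home/masterCatalogId/en_US/Catalogentry/conf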

This path includes the Master Catalog folder (masterCatalogId), which contains the configuration files for each language.

Update the Solr configuration files directly. For example, update schema.xml instead of creating separate configuration files for your customization.
New index fields can be added to the search index schema. The field name should start with the prefix xf_, as this convention prevents naming conflicts between customization properties and default WebSphere Commerce properties.
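
As a minimal sketch of the xf_ naming convention, a custom field declaration in schema.xml might look like the following. The field name xf_warranty and its attribute values are hypothetical; adjust them to your data.

```xml
<!-- Hypothetical custom index field; the xf_ prefix avoids conflicts with default fields -->
<field name="xf_warranty" type="wc_text" indexed="true" stored="true" multiValued="false"/>
```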

The analyzers and tokenizers that are used in the default WebSphere Commerce index field types can be replaced with custom or third-party analyzers. The order of the analyzers can also be customized. However, ensure that the new analyzers function properly and are compatible with the other analyzers.

For example, the analyzers in the following field type can be replaced with custom or third-party analyzers:


<fieldType name="wc_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

You should have advanced knowledge of analyzers and tokenizers before you change the analyzer settings. Follow the general Solr recommendations when you set field types. For example, use tokenized fields for search, and untokenized fields for sorting or faceting.
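
The following sketch shows one way that a custom field type with a different tokenizer might be declared, rather than a recommended configuration. The type name xf_text is hypothetical. The single analyzer element, with no type attribute, applies to both indexing and querying.

```xml
<!-- Hypothetical custom field type; the name xf_text is an assumption -->
<fieldType name="xf_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- StandardTokenizerFactory is used here in place of WhitespaceTokenizerFactory -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase all tokens so that matching is not case sensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

Rebuild the search index after you change a field type so that existing documents are analyzed with the new settings.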

The following naming conventions are used in WebSphere Commerce:

fieldName: Tokenized and not case sensitive. For example, mfName.
fieldName_cs: Tokenized and case sensitive. For example, mfName_cs.
fieldName_ntk: Non-tokenized and not case sensitive. For example, mfName_ntk.
fieldName_ntk_cs: Non-tokenized and case sensitive. For example, catenttype_id_ntk_cs.
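
As a sketch of how these suffixes might combine with the xf_ prefix, the following fragment pairs a tokenized search field with a non-tokenized, case-sensitive copy for sorting or faceting. The field names are hypothetical, and the fragment assumes that your schema.xml defines an untokenized field type named string.

```xml
<!-- Hypothetical fields; names and the "string" type are assumptions -->
<!-- Tokenized, not case sensitive: used for search -->
<field name="xf_brand" type="wc_text" indexed="true" stored="true"/>
<!-- Non-tokenized, case sensitive: used for sorting or faceting -->
<field name="xf_brand_ntk_cs" type="string" indexed="true" stored="false"/>
<!-- Populate the untokenized field from the searchable field -->
<copyField source="xf_brand" dest="xf_brand_ntk_cs"/>
```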

You can customize the spellCheck field by adding content to it or removing content from it to improve the spell check functionality.

For example, you can add more index fields to spell check or remove index fields from it:


<!-- Copy fields for spell check -->
 <copyField source="name" dest="spellCheck"/>
 <copyField source="mfName" dest="spellCheck"/>
 <copyField source="shortDescription" dest="spellCheck"/>
 <copyField source="keyword" dest="spellCheck"/>
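
For instance, to include a custom field in spell checking, a line such as the following might be added to the same group. The source field xf_warranty is hypothetical and is assumed to be already declared in schema.xml.

```xml
<!-- Hypothetical: feed a custom field into the spellCheck field -->
<copyField source="xf_warranty" dest="spellCheck"/>
```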

You can customize the defaultSearch field by adding content to it or removing content from it.

For example, you can add more index fields to the defaultSearch field:


<!-- Copy fields for default search field -->
 <copyField source="name" dest="defaultSearch"/>
 <copyField source="shortDescription" dest="defaultSearch"/>
 <copyField source="partNumber_ntk" dest="defaultSearch"/>
 <copyField source="keyword" dest="defaultSearch"/>
 <copyField source="cas_f*" dest="defaultSearch"/>
 <copyField source="cai_f*" dest="defaultSearch"/>
 <copyField source="caf_f*" dest="defaultSearch"/>
 <copyField source="ads_f*" dest="defaultSearch"/>
 <copyField source="adi_f*" dest="defaultSearch"/>
 <copyField source="adf_f*" dest="defaultSearch"/>


<copyField source="name" dest="defaultSearch"/>
<copyField source="shortDescription" dest="defaultSearch"/>
<copyField source="partNumber_ntk" dest="defaultSearch"/>
<copyField source="keyword" dest="defaultSearch"/>
<copyField source="cas_f*" dest="defaultSearch"/>
<copyField source="cai_f*" dest="defaultSearch"/>
<copyField source="caf_f*" dest="defaultSearch"/>
<copyField source="nameOverride" dest="defaultSearch"/>
<copyField source="shortDescriptionOverride" dest="defaultSearch"/>
<copyField source="keywordOverride" dest="defaultSearch"/>
<copyField source="categoryname" dest="defaultSearch"/>
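
A custom field can be made searchable by default in the same way. In this sketch, the source field xf_warranty is hypothetical and is assumed to be already declared in schema.xml.

```xml
<!-- Hypothetical: include a custom field in the default search field -->
<copyField source="xf_warranty" dest="defaultSearch"/>
```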

When you customize the preprocessor functionality, do not change the existing preprocess configuration files. Create new files instead.
The names of custom preprocess configuration files must start with wc-dataimport-preprocess and end in .xml.
For example, wc-dataimport-preprocess-XXXXX.xml or wc-dataimport-preprocess-custom-listprice.xml.