Customizing the index life cycle

The HCL Commerce Search index life cycle provides numerous extension points. There are eight main stages to the life cycle, and each of these comes with its own extension opportunities. Your primary tool for customization is the configuration files that control the indexing process. Beyond configuration, you can build and integrate your own custom code.

The index lifecycle

Indexing (including preprocessing) is performed in the HCL Commerce server. After index preprocessing completes successfully, the Data Import Handler (DIH) can be run from the same HCL Commerce server. Solr can then start creating and updating the Lucene index from the HCL Commerce temporary database tables. If preprocessing is not required, such as for the inventory extension index, the DIH process can be started either from HCL Commerce or directly from a URL issued against the HCL Commerce Search server.
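
When DIH is started by URL, the request is an ordinary Solr Data Import Handler command. The following sketch only assembles such a URL and does not send it. The host name, port, and core name are placeholders, and the sketch assumes that the handler is registered at the conventional /dataimport path on the core, which you should confirm against your own solrconfig.xml.

```python
from urllib.parse import urlencode


def build_dih_url(host, port, core, command="full-import", **params):
    """Assemble a Data Import Handler request URL for one Solr core.

    Assumes the handler is mounted at /solr/<core>/dataimport.
    """
    query = urlencode({"command": command, **params})
    return f"http://{host}:{port}/solr/{core}/dataimport?{query}"


# Hypothetical host, port, and core name, for illustration only.
url = build_dih_url("search.example.com", 3737, "MyInventoryCore", clean="false")
print(url)
```

The command and clean parameters are standard DIH request parameters. Any additional parameters that your deployment expects would be passed in the same way.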

There are eight steps to the indexing lifecycle itself, as indicated in the following figure.

1. Data Load - External system
An external system, such as a Product Information Management (PIM) system or a Content Management System (CMS), usually Websphere Content Hub, is used to load data into the Commerce authoring database.
When Data Load saves data, for example a new season's products, into the workspace schema database, it also records the primary keys of those products in search-specific delta tables. Later, when buildIndex runs, it checks these delta tables to determine which data was added, changed, or deleted in the workspace schema. It then builds the new season's products from the workspace database into the workspace index.
You can customize Data Load by defining a new mediator. buildIndex can also be customized to build additional information into the workspace index.
2. Preview
After the workspace index core has synchronized with the workspace schema database, your workspace administrators can make changes to the new season products. They can preview new as-yet unapproved data, categories and content on the storefront. The event consumer analyzer built into each catalog will add the data changes coming from the Management Center into each search-specific delta table. The buildIndex process again reads from each delta table and builds the data into a workspace index core.
3. Approve
When the workspace administrator is satisfied with the catalog data, they approve and publish the workspace. The workspace schema data will be committed into the base schema database, and the workspace index will be committed into the base index.
4. Indexing service
The indexing service will first run the preprocess script located in the Transaction server to flatten the data into a temporary table or view. Then, the service runs DataImportHandler to import data from the temporary table or view into the index.
In the base schema, there are four types of index by default: CatalogEntry, CatalogGroup, Unstructured, and extended (for Price and Inventory). You can customize the index by adding a new index. You can influence the behavior of these indexes by customizing x-schema.xml, x-solrconfig.xml, and x-data-config.xml. Additional configuration files such as stopwords.txt permit fine-grained control over the indexing of search terms. If you require deeper customization, for example complete replacement of schema.xml, solrconfig.xml, or data-config.xml, you can achieve this by using the SRCHCONF and SRCHCONFEXT tables.
5. Stagingprop
When everything is ready in the authoring environment, the database is propagated to the live instance.
6. Indexprop
Data replication is done by the Search administration service.
7. Replication
After the index is ready on the repeater, the Solr engine's built-in replication process will replicate the index from the repeater to subordinate nodes. You can configure whether to replicate an external file or schema file, or whether to force a health check after replication. You can also customize on top of the Search-provided replication handler.
8. Cache Invalidation
A data cache is enabled on the Search server. It is used mainly to cache three types of data: database centralized operations, such as B2B entitlement information or facet configuration; search rule lookups that need a callback to the Transaction server; and category hierarchy information that is created by centralized index access. When facet or entitlement data in the database, a search rule, or the index changes, invalidation is triggered for both the data cache and the storefront fragment cache. Entitlement, facet, and search rule cache invalidation is normally triggered by a database trigger, whereas category hierarchies are invalidated through the CACHEIVL table. IndexProp normally registers cache invalidation events in the CACHEIVL table. Later, every search REST call invokes the cache manager in the Search server, which reads CACHEIVL and issues the invalidation.
Note: While the Search and Transaction servers do invalidation using the CACHEIVL table, the Store server issues invalidations by subscribing to a Kafka message.
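
The table-driven invalidation described in step 8 can be pictured with a small in-memory model. This is a conceptual sketch only: it does not reproduce the real CACHEIVL schema or the Search server's cache manager, and every class, method, and key name in it is hypothetical.

```python
class InvalidationLog:
    """In-memory stand-in for an invalidation table such as CACHEIVL."""

    def __init__(self):
        self.rows = []  # (sequence, cache_key) tuples, oldest first

    def register(self, cache_key):
        # A propagation process would add a row like this after an index change.
        self.rows.append((len(self.rows) + 1, cache_key))

    def since(self, sequence):
        return [row for row in self.rows if row[0] > sequence]


class CachingSearchServer:
    """Checks the log on every request before serving from its cache."""

    def __init__(self, log):
        self.log = log
        self.cache = {}
        self.last_seen = 0

    def _apply_invalidations(self):
        for sequence, cache_key in self.log.since(self.last_seen):
            self.cache.pop(cache_key, None)  # evict the stale entry if present
            self.last_seen = sequence

    def get(self, cache_key, loader):
        self._apply_invalidations()
        if cache_key not in self.cache:
            self.cache[cache_key] = loader()
        return self.cache[cache_key]


log = InvalidationLog()
server = CachingSearchServer(log)
first = server.get("category-hierarchy", lambda: "v1")   # loaded and cached
second = server.get("category-hierarchy", lambda: "v2")  # still served from cache
log.register("category-hierarchy")                       # invalidation event recorded
third = server.get("category-hierarchy", lambda: "v2")   # reloaded after eviction
print(first, second, third)  # v1 v1 v2
```

The point of the model is that the cache is not told about changes directly. It discovers them by reading the log on its next request, which is why an event recorded in the table takes effect only when a later search call arrives.
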
IBM provides three levels of support for customizing these steps:
  • Supported: The customization process is supported. For more information, see Contacting HCL Customer Support.
  • Future support: IBM intends to support the customization feature in a future release. If you have a related customization need, you can contact the IBM support team for more information.
  • Not supported: IBM is not planning to support such a customization feature in the future. You should avoid customizing the related area or look for alternatives in your implementation.
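
As an illustration of the delta-table bookkeeping described in steps 1 and 2 of the lifecycle, the following sketch applies a list of recorded changes to an index. It is a simplified model, not the buildIndex implementation: the source does not describe the columns of the delta tables, so the (primary key, action) row shape and the action names used here are assumptions.

```python
def apply_delta(index, database, delta_rows):
    """Apply recorded changes to an index and return how many rows were processed.

    index and database are plain dicts keyed by primary key.
    delta_rows is a list of (primary_key, action) tuples (hypothetical shape).
    """
    for primary_key, action in delta_rows:
        if action == "delete":
            index.pop(primary_key, None)
        else:
            # "add" or "update": re-read the current row and index it again
            index[primary_key] = database[primary_key]
    processed = len(delta_rows)
    delta_rows.clear()  # the recorded changes have been consumed
    return processed


# Hypothetical catalog data: two new products loaded, one old product removed.
database = {1001: "Spring jacket", 1002: "Rain boots"}
index = {1000: "Old scarf"}
delta = [(1001, "add"), (1002, "add"), (1000, "delete")]

count = apply_delta(index, database, delta)
print(count, index)
```

Because only the keys listed in the delta rows are touched, the cost of a build depends on the number of changes rather than on the size of the catalog.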

Customizing by lifecycle step

The following table links the index lifecycle steps to corresponding configuration topics in the Knowledge Center.

Most steps in the table are listed as supported or unsupported; however, some items are configurable. This means that their behavior can be changed using command parameters or other built-in capabilities.
Step | Method | Customization point | Level of support*
1 | Data Load, buildIndex utility | Custom mediators; buildIndex. |
2 | Store preview / delta build | Reindexing | Customization not supported. See Index synchronization and delta updates in HCL Commerce Search.
3 | Approve index | Data commit to base schema. | Customization not supported.
4 | Customize default index schema.xml file | Add customized field reusing HCL Commerce fieldType into default index schema (including preprocess or buildindex). | Customization supported. See Search index schema customization file x-schema.xml.
  |  | Add customized fieldType. | Customization supported. See Search index schema customization file x-schema.xml.
  |  | Modify fields with Solr native fieldType in schema.xml (such as int, string, date, float, long). | Customization not supported.
  |  | Modify an HCL Commerce fieldType in x-schema.xml (such as wc_text*, add StopWords to partNumber_ntk). | Customization supported. See Search index schema customization file x-schema.xml.
  |  | Modify an HCL Commerce fieldType in x-schema.xml for a specified language (for example, to customize StopWords for English only). | Customization supported. See Search index schema customization file x-schema.xml.
  |  | Add new language core for existing default index. | Customization supported. See Setting up the search index.
  | Customize native Solr using solrconfig.xml | Revert to previous versions' MultipleQueryComponent for customized extension core. Return field from extended index (for example, return price information and inventory information; return customized information from customized extended index as well). Filter by extended index field (for example, only display products that have non-zero inventory). Sort by extended index field (for example, sort by price). Facet by extended index (for example, display price facet from 0 to 100). Boost by extended index (for example, boost products that have inventory larger than 100). Customize a Solr function to do sorting. | Customization supported.
  |  | Register a customized query parser. | Customization supported. See Customizable components of the final Solr query.
  |  | Register a function parser. | Customization supported. See Enabling search on additional unstructured content types.
  |  | Register a transform during buildindex. | Customization supported. See The indexing process.
  |  | Customize IndexSearcher related event listener. | Customization not supported.
  |  | Customize update related event listener. | Customization not supported.
  | Customization on wc-data-config.xml | Customization of existing field mappings in wc-data-config.xml for CatalogEntry, CatalogGroup, Unstructured, Inventory, or Price. | Customization not supported.
  |  | Add a new field mapping in wc-data-config.xml for CatalogEntry, CatalogGroup, Unstructured, Inventory, or Price. | Customization supported. See Extending the wc-data-config.xml file using the wc-data-preprocess-x-finalbuild.xml file.
  |  | Choose non-ATP or DOM inventory for wc-data-config.xml. | Customization supported. See Search properties in the component configuration file (wc-component.xml).
  |  | Completely override the default wc-data-config.xml. | Customization supported. See Extending the wc-data-config.xml file using the wc-data-preprocess-x-finalbuild.xml file.
  | Customize Index Load | Indexload for master CatalogEntry or CatalogGroup. | Customization not supported.
  |  | Indexload for customized extension core. | Customization not supported.
  | Customization to Sharding | Shard on a core other than catentry. | Customization not supported.
5 | Propagate staged data | Modify command-line parameters of the StagingProp utility. | Configurable using the StagingProp utility. See stagingprop utility.
6 | Index propagation | Modify command-line parameters of the IndexProp utility. | Configurable using the IndexProp utility. See Propagating the search index.
7 | Replication, operation and healthcheck | Perform replication on a customized index, for example replicate from a master to a repeater, or from a repeater to a subordinate. | Customization not supported.
  |  | Replicate on an external file for a customized index. | Customization not supported.
  |  | Customize existing replication, operation, or healthcheck functions. | Customization supported. See Index verification extension points.
8 | Cache and cache invalidation | Cache invalidation by CACHEIVL table for a customized index. | Configurable. See Cache invalidation.
  |  | Replace with other centralized cache provider (such as WXS). | Customization not supported.