WebSphere Commerce Search extension indexes

Extension indexes, set up as index subtypes in WebSphere Commerce Search, are used to keep data in a separate core for performance reasons.

The following extension indexes are available by default:
Inventory
The inventory index, a separate index that contains inventory data, is an extension of the product index. For accurate inventory status, you can refresh the inventory index more frequently than the product index.
Price
The price index is a separate index that contains price data. Prices are indexed by using Index Load, which can populate a large amount of data into a separate extension index faster than the Catalog Entry index can index price data.

You can use the default extension indexes, or set up your own extension index that best matches your site's requirements.

General considerations

The WebSphere Commerce Search query component extension tries to replicate most of the Solr supported search features when you work with an extension index. However, because of the complexity of the logic that is involved at run time, not every feature is available. The following list describes the features that are generally supported for extension indexes:

Schema design:
  • For the base index to be able to reference an extension index, the extension index schema must define a field that acts like a foreign key. This field must match the name and type of the unique key field in the base index schema, and its data type must be a simple data type such as String, Integer, or float.
  • Avoid common field names between extension indexes and the base index, other than the referenced field. It is recommended to use a naming convention that prefixes the extension index fields to avoid naming collisions.
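The naming guidance above can be checked mechanically. The following is a minimal sketch, not part of the product, that reports field names shared by a base index and an extension index other than the referenced key. The function name and every field name in it are hypothetical and chosen only for illustration.

```python
def find_field_name_collisions(base_fields, extension_fields, reference_field):
    """Return the field names that both schemas define, ignoring the shared key."""
    shared = set(base_fields) & set(extension_fields)
    # The referenced (foreign-key-like) field is expected to exist in both schemas.
    shared.discard(reference_field)
    return sorted(shared)


# Hypothetical field names: "price" is defined in both schemas and collides.
base_fields = {"catentry_id", "name", "price"}
extension_fields = {"catentry_id", "inv_status", "price"}

collisions = find_field_name_collisions(base_fields, extension_fields, "catentry_id")
print(collisions)
```

Renaming the extension field with a prefix, for example inv_price in this hypothetical schema, would leave the collision list empty.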
Searching:
  • The q parameter is the only mandatory query parameter. It must be a query that is specified in SolrQuerySyntax.
  • Any index column, including index columns from an extension index, can be used in a query expression. However, cross-index query conditions degrade performance and are typically not recommended.
Filtering:
  • The fq parameter specifies a query that restricts the superset of documents that can be returned, without influencing the relevancy score. It can be useful for speeding up complex queries, because the queries specified with fq are cached independently from the main query.
  • This parameter can be specified multiple times in the same request. Documents are included in the result only if they are in the intersection of the document sets resulting from each fq.
  • Filter queries can be complicated Boolean queries, but the fields that are involved must belong to the same index. That is, cross-core filter queries are not supported.
  • The document sets from each filter query are cached independently.
  • Special characters must be properly URL escaped, as with all parameters when expressed in a URL.
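As an illustration of the filtering rules above, the following minimal sketch builds a URL-escaped query string that repeats fq once per filter. It uses only the Python standard library; the helper name and the field names are hypothetical, and each filter is assumed to reference fields from a single index.

```python
from urllib.parse import urlencode


def build_filter_query_string(q, filters):
    """Return a URL-escaped query string with one fq parameter per filter."""
    params = [("q", q)]
    # Each filter becomes its own fq parameter; the result set is the
    # intersection of the documents that match every fq.
    params.extend(("fq", filter_query) for filter_query in filters)
    # urlencode escapes special characters such as ':' and '[' in the values.
    return urlencode(params)


# Hypothetical field names, used only for illustration.
query_string = build_filter_query_string(
    "name:shirt",
    ["catalog_id:10001", "price_USD:[10 TO 50]"],
)
print(query_string)
# q=name%3Ashirt&fq=catalog_id%3A10001&fq=price_USD%3A%5B10+TO+50%5D
```

Keeping each restriction in its own fq, rather than combining them into one expression, lets each document set be cached independently.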
Faceting:
  • Faceting is done on indexed values, rather than stored values. This is because the primary use for faceting is to select a subset of hits that result from a query, so the chosen facet value is used to construct a filter query that literally matches the value in the index, while the stored value is for display purposes only.
  • Many faceting parameters can be overridden on a per-field basis, by using the following syntax: f.fieldName.parameterName=parameterValue.
  • Specific filters can be tagged or excluded when faceting. Tagging or exclusion is typically needed when multiple facets are selected. However, tagging and exclusion within the same query are restricted to fields from the same core. That is, a query that tags or excludes fields from more than one core is not supported.
  • Pivot facet, facet by date, and facet by range are not supported.
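The per-field override syntax f.fieldName.parameterName=parameterValue can be sketched as follows. This is an illustrative example only: the helper name and the facet field names (brand, category) are hypothetical, and the request is built as a plain query string rather than sent to a server.

```python
from urllib.parse import urlencode


def per_field_facet_param(field_name, parameter_name):
    """Build the name of a per-field override: f.<fieldName>.<parameterName>."""
    return f"f.{field_name}.{parameter_name}"


params = [
    ("q", "shirt"),
    ("facet", "true"),
    ("facet.field", "brand"),      # hypothetical facet field
    ("facet.field", "category"),   # hypothetical facet field
    ("facet.limit", "10"),         # limit applied to all facet fields
    # Override the limit for the brand field only.
    (per_field_facet_param("brand", "facet.limit"), "5"),
]

query_string = urlencode(params)
print(query_string)
```

In this sketch, category keeps the general limit of 10 while brand is restricted to 5 constraint counts.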
Sorting:
  • A sort order must include a field name, either from the base index or extension index, followed by white space, followed by a sort directional operator (ascending or descending).
  • Only field names can be used. Function names and sorting by docId are not supported.
  • Multiple sort criteria can be separated by commas. When more than one sort criterion is provided, the second entry is used only if the first entry results in a tie. If there is a third entry, it is used only if the first and second entries are tied. This pattern continues with further entries.
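The sorting rules above can be sketched as a small helper that assembles a sort value from field and direction pairs. This is a minimal sketch under stated assumptions: the helper name and field names are hypothetical, the direction keywords are the Solr asc and desc operators, and the identifier pattern used to reject function expressions is a simplification chosen for this example rather than a rule taken from the product.

```python
import re

# Assumption for this sketch: a plain field name contains only letters,
# digits, and underscores, so anything with parentheses is rejected.
_FIELD_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_sort_param(criteria):
    """Join (field, direction) pairs into a comma-separated sort value."""
    clauses = []
    for field_name, direction in criteria:
        if not _FIELD_NAME.match(field_name):
            raise ValueError(f"only plain field names are allowed: {field_name!r}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc': {direction!r}")
        clauses.append(f"{field_name} {direction}")
    # Later entries only break ties left by the earlier ones.
    return ",".join(clauses)


# Hypothetical field names, used only for illustration.
sort_value = build_sort_param([("price_USD", "asc"), ("name", "desc")])
print(sort_value)  # price_USD asc,name desc
```

Here documents are ordered by price_USD first, and name is consulted only for documents whose price_USD values tie.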
Grouping:
  • Result grouping arranges documents with a common field value into groups, returning the top documents per group, and the top groups based on what documents are in the groups. Grouping can be performed only against index columns from the base index. That is, grouping by extension index fields is not supported.
Joining:
  • Join operations can be used only against the base index. Joins with an extension index are not supported.
  • Fields or other properties of the documents that are joined from (the "from" documents) are not available for use in processing the resulting set of "to" documents. That is, you cannot return fields in the "from" documents as if they are a multivalued field on the "to" documents.
  • The join query produces constant scores for all documents that match. The scores that are computed by the nested query for the "from" documents are not available to use in scoring the "to" documents.

Common query parameters

The following list describes the common query parameters and restrictions when specified against the base index:
start
Paginates results from a query. When specified, it indicates the offset in the complete result set at which the set of returned documents begins. The default value is 0.
rows
Paginates results from a query. It specifies the maximum number of documents from the complete result set to return to the client for every request. You can consider it as the maximum number of results that appear in the page.
fl (fields)
Specifies a set of fields to return, limiting the amount of information in the response. Any index column, including index columns from an extension index, can be declared in this parameter for returning the stored value of the corresponding index field. When no index field is provided, or a * is provided, all stored static index fields from the base (CatalogEntry) index are returned. Dynamic fields and other extension index fields must be explicitly declared to be returned.
Note: Use of the fl parameter can result in considerable performance degradation.
facet
Enables or disables Simple Faceting. Setting this parameter to true enables facet counts in the query response. The default value is false, which disables faceting.
facet.query
Specifies an arbitrary query in the Lucene default syntax to generate a facet count. By default, faceting returns a count of the unique terms for a field, while facet.query determines counts for arbitrary terms or expressions. This parameter can be specified multiple times to indicate that multiple queries are used as separate facet constraints.
facet.field
Specifies a field that is treated as a facet. This parameter can be specified multiple times to indicate multiple facet fields.
facet.prefix
Limits the terms on which to facet, starting with the specified string prefix. Unlike fq, it does not change the search results; it merely reduces the facet values returned to those beginning with the specified prefix. This parameter can be specified only on a per-field basis, and has no additional effect on other faceting fields.
facet.sort
Determines the ordering of the facet field constraints. The following values can be used:
  • count: sort the constraints by count (highest count first).
  • index: return the constraints sorted in their index order (lexicographic by indexed term). For terms in the ASCII range, this order is alphabetical.
The default sort is by count when facet.limit is greater than 0. Otherwise, the default is set to index. This parameter can be specified only on a per-field basis and has no additional effect on other faceting fields.
facet.limit
Indicates the maximum number of constraint counts to be returned for the facet fields. A negative value denotes unlimited. The default value is 100. This parameter can be specified on a per-field basis to indicate a separate limit for certain fields.
facet.offset
Indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified only on a per-field basis and has no additional effect on other faceting fields.
facet.mincount
Indicates the minimum count that a facet field constraint must have to be included in the response. The default value is 0. This parameter can be specified only on a per-field basis and has no additional effect on other faceting fields.
qf (query fields)
Provides a list of index fields and the boost factor to associate with each of them when building DisjunctionMaxQueries from the search request. The supported format is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree has a boost of 0.4. This indicates that matches in fieldOne are much more significant than matches in fieldTwo, which are more significant than matches in fieldThree.
bq (boost query)
Defines a raw query string (in the SolrQuerySyntax) that is included with the search query to influence the score. If this is a BooleanQuery with a default boost (1.0f), then the individual clauses are added directly to the main query. Otherwise, the query is included as-is. Any index column, including index columns from an extension index, can be used in a boost query expression. However, because boost queries are handled the same way as a normal query, the same restriction applies: cross-index query conditions degrade performance and are typically not recommended. This parameter can be specified multiple times to indicate multiple boost queries.
bf (additive boost function)
Defines a function (with optional boosts) that can be included in the search query to influence its score. Any function that is natively supported by Solr can be used, along with a boost value. This parameter is equivalent to using the _val_:"...function..." syntax in a bq parameter. This parameter can be specified multiple times to indicate multiple additive boost functions.
boost (multiplicative boost function)
This parameter has the same syntax as bf, with the exception that the boost factor specified is multiplied into the score. This parameter can be specified multiple times to indicate multiple multiplicative boost functions.
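Several of the parameters in this list can be combined in one request. The following minimal sketch shows how start, rows, and qf might be assembled into a query string. The helper names are hypothetical, the 1-based page numbering is an assumption of this example, and the field names fieldOne, fieldTwo, and fieldThree are the placeholders from the qf description.

```python
from urllib.parse import urlencode


def page_to_start(page_number, rows):
    """Convert a 1-based page number into the start offset for that page."""
    if page_number < 1 or rows < 1:
        raise ValueError("page_number and rows must be positive")
    return (page_number - 1) * rows


def format_qf(field_boosts):
    """Format (field, boost) pairs as a qf value; a boost of None means default."""
    parts = []
    for field_name, boost in field_boosts:
        parts.append(field_name if boost is None else f"{field_name}^{boost}")
    return " ".join(parts)


qf_value = format_qf([("fieldOne", 2.3), ("fieldTwo", None), ("fieldThree", 0.4)])
print(qf_value)  # fieldOne^2.3 fieldTwo fieldThree^0.4

params = [
    ("q", "shirt"),
    ("start", page_to_start(3, 12)),  # third page of 12 results starts at offset 24
    ("rows", 12),
    ("qf", qf_value),
]
query_string = urlencode(params)
print(query_string)
```

The bq, bf, and boost parameters can be appended to the same list of pairs, repeating a parameter name when more than one value is needed.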

Performance considerations

Consider the following points when an extension index such as Inventory exists in WebSphere Commerce Search:
  • The filterCache and documentCache are required on the product index so that the query component functions correctly.
  • You should typically disable all other internal Solr caches for the extension index in the search run time.