Introduced in Feature Pack 2

WebSphere Commerce search index schema definition

The WebSphere Commerce search index schema definition contains information for processing data in WebSphere Commerce.

For more detailed information, check the solrhome/MC_masterCatalogId/locale/CatalogEntry/conf/schema.xml file for the search index schema definition that matches your version of WebSphere Commerce search.

Feature Pack 6 or later

Catalog Entry extended sites override

When an extended site overrides any of the following catalog entry properties: Name, short description, and keywords, both the asset store, and the extended site properties are being indexed. If a hit matches either the asset store, or the extended site store, the product will be returned and will be updated by the runtime with the extended site override display text.

Feature Pack 6 or later

Category Name

The category name is being indexed under the CatalogEntry index. This is a searchable field, which is used to enhance the relevancy of the product being returned. Runtime components will boost this field. For more information, see Search relevancy and merchandising.

Master catalogs and multiple languages

A WebSphere Commerce instance might have multiple master catalogs, with each master catalog belonging to one store, or in extended sites, might belong to a cluster of stores. The master catalog is not reflected in the index schema. That is, it is reflected at the level of index cores. In addition, each master catalog contains its supported languages through its attached stores. A separate index core is created for each language and master catalog.

The main benefits of maintaining separate index cores for each language are:
Clean schema design
The text search schema is language dependent. For example, different languages contain different analyzers, where some languages contain spell checking or stemming, while others do not.
Runtime query performance
Shoppers typically only search using one language. The query then only runs against one index.
Easier maintenance for language dictionaries
Each language contains different synonyms, stop words, and rules.

Where the WebSphere Commerce instance contains two master catalogs, with each master catalog possibly containing different languages. Separate catentry index cores are created in each language, under each master catalog. When new WebSphere Commerce objects are indexed, new cores are created using the same structure. Each index then provides its own set of search functionality in the storefront, such as its own automatic keyword suggestions, spelling corrections, and dictionaries.

WebSphere Commerce store models and catentry types

Search indexing supports various store models by indexing the top level Catentry store_id. This information is stored in the STORECENT table. In extended site, the store path information is built in query conditions, including both the shopper's extended site store_id, and parent asset store_id.

WebSphere Commerce catentry types such as Product, Item, Bundle, Package and DynamicKit are supported by default for WebSphere Commerce search indexing. They are located in the same index, with their data extracted separately from the WebSphere Commerce database. For example, the query to extract attribute values for products differs for items. The CATENTTYPE_ID is indexed as a field in the search index to distinguish each catentry's type.

Prices in WebSphere Commerce search

WebSphere Commerce search indexes the default offer price. The default offer price is typically considered as the offer price under the catentry's owning store's master catalog's default trading position container. The default offer price can also be complex by considering the following variables:
The store's default currency
Each store contains only one default currency. This default currency price is indexed as a searchable field. The default index contains USD as the default currency, following the WebSphere Commerce sample stores. If your store's default currency is different, you can index it by adding new searchable fields in the index schema and adding new SQL in the Data Import Handler (DIH).
Multiple stores, each with a different default currency
This requires separate index fields for each currency, For example, price_USD, price_CAD, price_EUR, price_CNY, and so on. The following currency codes contain a predefined default index field:
  • USD
  • CAD
  • EUR
  • CNY
  • TWD
  • JPY
  • GBP
  • KRW
  • BRL
  • PLN
  • RON
  • RUB
  • EGP
  • Feature Pack 6 or laterILS
  • Feature Pack 7 or laterTRY
Where the default indexing indexes the price for different currencies, if there is a value for the currency in the OFFERPRICE table.
Stores with multiple currencies (currency conversion)
There are two different approaches in WebSphere Commerce to support multiple currencies and currency conversion. The first approach is populating the OFFERPRICE table with the price in each currency. The second approach is calculating the currency exchange based on the default currency price during runtime, with the exchange rate existing in the CURCONVERT table. Since the WebSphere Commerce sample stores use the second approach and only contain the price for USD, the default indexing only contains the price_USD field as populated.
Miscellaneous attributes of offer
The effective date (OFFER.STARTDATE and OFFER.ENDDATE) is not considered by the default index. Instead, the offer precedence is considered in indexing, with only the offer price with the highest OFFER.PRECEDENCE indexed. The offer quantity is also considered in indexing, where only the offer with (OFFER.MINIMUMQUANTITY IN (1, 0) OR OFFER.MINIMUMQUANTITY IS NULL) is indexed.
Price override in extended site stores
Not indexed by default.
Price rule
Not indexed by default.
Feature Pack 7 or laterCatalog Entry List price
Feature Pack 7 or laterIndexed by default.

Contract entitlement

The runtime contract entitlement filtering is performed through the product set. Based on the shopper's eligible contract in the session, where the contracts are represented as a list of included and excluded product sets, the PRSETCEREL table is checked against each catentry to ensure the shopper is eligible to access the catentry. Therefore, the PRSETCEREL.PRODUCTSET_ID is indexed and queried with the shopper's list of included and excluded product sets. When the includeEntireCatalog flag is true the entitlement logic ignores productset checking. That is, the entire catalog is selected and productset_id is not evaluated against the search index.

Spell check

A dedicated indexed multivalue field is being used to maintain the source and scope of the spell check suggestions. The content of the spellCheck field is populated by copying other column's content. The following are all the fields copied into the spellCheck field:
  • name
  • shortDescription
  • keyword
  • nameOverride
  • shortDescriptionOverride
  • keywordOverride

The spell check index field is the source of the keyword suggestions under the Keyword auto-suggest section, and also the source of the suggestions of misspelled keywords in the storefront.

Parent catalog groups and catentries

There are two parent catalog group fields in the index:
  • parentCatgroup_id_search
  • parentCatgroup_id_facet
The parentCatgroup_id_search field is used for search filtering. For example, searching for a catentry under a specific parent catalog group that can be in master catalog or sales catalog, where the parent catalog group is not necessarily the direct parent of the catentries. To improve the runtime search performance, parentCatgroup_id_search includes both the catentry's direct parent group and all its parents in the catalog hierarchical graph. In addition, since a catentry can contain different parent groups under different catalogs, the parent catgroup must be prefixed by the catalog_id: CatalogId_ParentCatgroupId. This multivalued field improves the search performance, but reduces the catgroup faceting performance. This is because the search engine calculates the faceting count for every parent catgroup, even if most of the calculation results are not useful for faceting.

Therefore, the parentCatgroup_id_facet field is used for search faceting, and only indexes the catentry's direct parent catgroups in one specific catalog. That is, catgroup faceting only needs the faceting numbers on leaf catgroups to build the navigation tree. The format of the field is parentCatgroupId.

For parent catentries in WebSphere Commerce, an item can belong to one product, many packages, many bundles and many kits. These parent catentry's IDs are indexes in a field in the index, so that the parent catentry can be easily identified.

Attribute Dictionary

Attributes in the Attribute Dictionary are indexed as catentry properties. Not only are the attributes indexed, but business users can dynamically add attributes from the index.

The mapping is stored in the ATTRDICTSRCHCONF table. ADS_Fn columns are for string type of attributes, ADI_Fn columns are for integer type of attributes, and ADF_Fn columns are for float type of attributes.

Feature Pack 4Feature Pack 2Feature Pack 3Ensure that you are aware of the following limitations:
  • As you can not dynamically add or remove fields without parsing and changing the SQLs in the DIH, a predefined maximum number of dynamic attributes is used
  • That is, a list of predefined columns (ADS_F1, ...., ADS_F30, ADI_F1, ..., ADI_F10, ADF_F1, ..., ADF_F10) is defined in the index pre-process temporary tables. There are 30 predefined columns by default for string type of the attributes, 10 for integer, and 10 for float for each master catalog. When a business user decides the set of attributes to be indexed, the Management Center is used to set the searchable flag for the list of attributes. The mapping of the attribute identifier, that is, the attr_id column in the ATTR table, and the predefined columns is then created. The mapping between the table columns and index fields is located in the DIH SQL.
  • When the selected attributes are below the maximum number of the predefined columns, new attributes can be added to the index by business users. Once the number reaches the maximum, the preprocess and DIH must be customized to allocate more predefined columns. For example, adding twenty more columns enables 20 more attributes to be dynamically added by business users.
.

Feature Pack 5 or laterThe preceding limitations do not apply in WebSphere Commerce Version 7 Feature Pack 5 and later.

Feature Pack 4Feature Pack 2Feature Pack 5Feature Pack 3Feature Pack 6

Attributes

Attribute data is stored in the ATTRIBUTE and ATTRVALUE tables. The attributes are scoped under each individual catentry. However, this can cause indexing problems because of the lack of consistency across catentries. For example, when a global attribute such as color is created for multiple catentries, a different attribute_id is assigned for each product. When indexing, attribute data is extracted by the ttribute's name. However, the attribute's name can be different across catentries, such as differing spelling or casing of words. Since the name is NL sensitive, it is difficult to keep the attribute consistent across different languages. Therefore, it is recommended to clean up the attribute data before building the search index. Following the design pattern of attribute dictionary indexing, a mapping table CLSATTRSRCHCONF enables you to define the mapping between the attribute's name and predefined search fields.
Note: You must populate the mapping information manually.
There are 30 predefined columns per master catalog and per language by default for string type of attributes (CAS_F1, ..., CAS_F30), 10 columns for integer type of attributes (CAI_F1, ..., CAI_F10), and 10 columns for float type of attributes (CAF_F1, ..., CAF_F10). When a business user decides the set of attributes to be indexed, the business user must work with the IT administrator to populate the corresponding mapping information into the CLSATTRSRCHCONF table. The following factors should be considered:
  • The same attribute for different languages must map to the same column. For example, if 'Color' (English, language id -1) is mapped to CAS_F1, 'Couleur' (French, language id -2) must also be mapped to the CAS_F1 column.
  • When COLOR and Color are the same, the two attribute names must be mapped to the same column.

When the selected attributes are below the maximum number of the predefined columns, new attributes can be added to the index by business users. Once the number reaches the maximum, the preprocess and DIH must be customized to allocate more predefined columns. For example, adding twenty more columns enables 20 more attributes to be dynamically added by business users.

Feature Pack 7 or later

Attribute Dictionary attributes

Some of the catalog entry's associated Attribute Dictionary Attributes properties are being index under the multivalue ad_attribute catalogEntry index field. The properties are tokenized into a string, separated by a delimiter.

The following is an example of the raw Attribute Dictionary attributes data as it appears in the catalog entry index:

<arr name="ad_attribute">
  <str>7000000000000000003/_/construction/_/Construction/_/1.00000/_/7000000000000000025/_/Knit/_/Knit/_/19.00000/_/1/_/0/_/1/_/0/_/0/_/2/_/0</str> 
  <str>7000000000000000004/_/length/_/Length/_/1.00000/_/7000000000000000027/_/Short/_/Short/_/20.00000/_/1/_/0/_/1/_/0/_/0/_/2/_/0</str> 
  <str>7000000000000000005/_/material/_/Material/_/1.00000/_/7000000000000000030/_/Synthetic/_/Synthetic/_/21.00000/_/1/_/0/_/1/_/0/_/0/_/2/_/0</str> 
  <str>7000000000000000002/_/swatchcolor/_/Color/_/1.00000/_/0/_/0/_/0/_/.00000/_/1/_/1/_/1/_/0/_/1/_/1/_/0</str> 
  <str>7000000000000000006/_/occasion/_/Occasion/_/1.00000/_/7000000000000000037/_/Cocktail & Evening/_/Cocktail & Evening/_/25.00000/_/1/_/0/_/1/_/0/_/0/_/2/_/0</str> 
  <str>7000000000000000001/_/swatchSize/_/Available Sizes/_/2.00000/_/0/_/0/_/0/_/.00000/_/1/_/1/_/1/_/0/_/1/_/1/_/0</str> 
</arr>
The following properties are being indexed and mapped to the following internal index field name:
Attributes properties
Index Database source table/column name Internal field name as defined in the wc-component.xml Description
0 CATENTRYATTR.ATTR_ID attr_id The attribute unique ID.
1 ATTR.IDENTIFIER identifier The attribute identifier.
2 ATTRDESC.NAME name The attribute display name.
3 CATENTRYATTR.SEQUENCE sequence The attribute sequence with respect to its product.
4 CATENTRYATTR.ATTRVAL_ID attrval_id The attribute value unique ID.
5 ATTRVAL.IDENTIFIER identifier The attribute value identifier.
6 ATTRVALDESC.STRINGVALUE,ATTRVALDESC.VALUE value The actual attribute value display value (if string type, uses the STRINGVALUE field; otherwise, uses the VALUE field.
7 ATTRVALDESC.SEQUENCE sequence The attribute value sequence with respect to its attribute.
8 ATTR.DISPLAYABLE displayable The attribute is marked searchable.
9 ATTR.SEARCHABLE searchable The attribute is marked displayable.
10 ATTR.COMPARABLE comparable The attribute is marked comparable.
11 ATTR.STOREDISPLAY storeDisplay The attribute is being used for ribbon ad.
12 ATTR.FACETABLE facetable The attribute is marked facetable.
13 CATENTRYATTR.USAGE usage The attribute is assigned to a product as either descriptive or defining attribute.
14 ATTRVALDESC.IMAGE1 image1 The attribute value image 1.
Feature Pack 6 or later

Custom fields in the catalog entry schema

The following search fields are defined in the catalog entry schema.xml file:

<field name="inventory" type="tint" indexed="true" stored="false" multiValued="false" />
<field name="customerRanking" type="tfloat" indexed="true" stored="false" multiValued="false" />
<field name="customerRanking_display" type="float" indexed="true" stored="true" multiValued="false" />
<field name="salesByVolume" type="tfloat" indexed="true" stored="false" multiValued="false" />
<field name="salesByQuantity" type="tfloat" indexed="true" stored="false" multiValued="false" />
<field name="bestSeller" type="boolean" indexed="true" stored="false" multiValued="false" />
<field name="profitMargin" type="tfloat" indexed="true" stored="false" multiValued="false" />
<field name="customerViews" type="tint" indexed="true" stored="false" multiValued="false" />
These fields are placeholders that can be used when customizing the search index. For example, when working with WebSphere Commerce search tutorials.
Feature Pack 7 or later

Bundles, Packages, and Dynamic Kits' components

Some of the catalog entry's components properties are indexed under the CatalogEntry component's multivalue index field. The properties are tokenized into a string, separated by a delimiter, and sequenced according to their sequence value in the CATENTREL.SEQUENCE, DKPDCCOMPLIST.SEQUENCE database columns.

The following is an example of the raw components data as it appears in the catalog entry index:

Bundle, and Package component index field:

<arr name="components">
   <str>11412/_/1.00000</str> 
   <str>11420/_/1.00000</str> 
</arr>
Dynamic Kit component index field:

<arr name="components">
   <str>12173/_/1.00000/_/11504/_/11504/_/assetModel001Class</str> 
</arr>
The following dynamic Bundle and Package properties are being indexed and mapped to the following internal index field name:
Bundle and Package properties
Index Database source table/column name Internal field name as defined in the wc-component.xml Description
0 CATENTREL.CATENTRY_ID_CHILD, and DKPDCCOMPLIST.CATENTRY_ID catentry_id The component unique ID.
1 CATENTREL.QUANTITY, DKPDCCOMPLIST.QUANTITY quantity The component quantity.
2 DKPDCCOMPLIST.DKPREDEFCONF_ID preDefConfId The Dynamic kit predefined configuration ID.
3 KPDCCOMPLIST.DKPDCCOMPLIST_ID componentListId The Dynamic kit components ID.
4 DKPDCCOMPLIST.GROUPNAME groupName The Dynamic kit group header name used to group like components for display.
Feature Pack 7 or later

Dynamic Kits indexed properties

The following Dynamic Kit properties are being indexed and mapped to the following internal index field name:
Dynamic Kit properties
Database source table/column name Internal field name as defined in the wc-component.xml Description
DKPREDEFCONF.COMPLETE isDKPreConfigured This is a flag to specify whether the predefined configuration is complete, or if it requires additional input from an external configurator. The default value, 0, indicates that the predefined configuration is not a complete configuration, and therefore cannot be added to a shopping cart without going through a configurator.
CATCONFINF.REFERENCE dkModelReference Reference information that is required by an external product configurator.
CATCONFINF.URL dkURL (if null, the value will be read from the TRANSPORT table by runtime) The URL location of an external product configurator.
CATCONFINF.CONFIGURATION dkDefaultConfiguration The detailed information of this configuration, typically, it might be XML from an external configurator.