di-buildindex utility

The di-buildindex utility is a wrapping utility that updates the information in the Master Index by using the Data Import Handler (DIH) service to build the index. The information is updated either partially through delta index updates or completely through full index builds. The DIH uses URLs to call commands, for example, http://host:port/solr/MasterCatalog_CatalogEntry_en_US/dataimport?command=full-import . The index building utility uses DIH to connect to the WebSphere Commerce database through a JDBC connection. It crawls the temporary tables that are populated by the preprocess utility, and then populates the Solr index. The wc-data-config.xml configuration file defines the JDBC configuration and SQL crawling statements.

The utility reports the status of the indexing progress at the interval set by the statusInterval parameter. By default, every 10 seconds the utility prints how many documents are indexed in each index, how long the utility has been running, and the current indexing status. After the utility completes, it reports how many documents were successfully indexed in each index, and which index builds failed, if any.
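As a rough illustration of the URL form mentioned above, the following Python sketch assembles a DIH command URL. The core name pattern (MasterCatalog_indextype_locale) is only inferred from the single example URL in this topic, and the function name, host, and port are hypothetical; check the actual core names on your search server.

```python
from urllib.parse import urlencode


def dih_url(host, port, index_type, locale, command="full-import"):
    """Build a DIH command URL shaped like the example in this topic."""
    # Assumption: the core name follows the pattern seen in the example URL.
    core = f"MasterCatalog_{index_type}_{locale}"
    query = urlencode({"command": command})
    return f"http://{host}:{port}/solr/{core}/dataimport?{query}"


# Hypothetical host and port, for illustration only.
print(dih_url("search.example.com", 8080, "CatalogEntry", "en_US"))
```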
Syntax diagram for di-buildindex utility.

Parameter values

instance_name
The name of the WebSphere Commerce instance with which you are working (for example, demo).
WebSphere Commerce Developer: The instance name is optional.
masterCatalogId
Required: The ID of the master catalog (for example, 10101).
If you do not know the master catalog ID, run the following SQL:
SQL: select * from catalog where IDENTIFIER='STORE_IDENTIFIER'
To find the master catalog ID for an Extended Site store:
  1. Find the store ID:
    select * from storeent where IDENTIFIER='STORE_IDENTIFIER'
    
  2. Use the storeent_id as the store_id in the following SQL to find the catalog asset store ID of this Extended Site store:
    
    select * from storerel where store_id=XXXXXX and streltyp_id=-4 and relatedstore_id not in (XXXXXX)
    
    Where XXXXXX is the storeent_id from the SQL in step 1 when building the search index.
  3. Get the master catalog ID:
    
    select * from storecat where storeent_id=YYYYYY and mastercatalog='1'
    
    Where YYYYYY is the relatedstore_id from step 2 when building the search index.
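The three lookup steps for an Extended Site store can be chained in a small script. The following Python sketch runs them against an in-memory SQLite database so that it is self-contained. The trimmed-down table layouts, the catalog_id column name, the store identifier DemoESite, and all ID values are assumptions made for illustration; against a real WebSphere Commerce database you would run the same queries through your own database connection.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory stand-in for the three tables (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table storeent (storeent_id integer, identifier text);
        create table storerel (store_id integer, streltyp_id integer,
                               relatedstore_id integer);
        create table storecat (storeent_id integer, catalog_id integer,
                               mastercatalog text);
        insert into storeent values (20001, 'DemoESite');
        insert into storerel values (20001, -4, 20001), (20001, -4, 20000);
        insert into storecat values (20000, 10101, '1'), (20000, 10102, '0');
    """)
    return conn


def find_master_catalog_id(conn, store_identifier):
    """Follow steps 1 to 3; return the master catalog ID or None."""
    cur = conn.cursor()
    # Step 1: find the store ID.
    row = cur.execute(
        "select storeent_id from storeent where identifier = ?",
        (store_identifier,)).fetchone()
    if row is None:
        return None
    store_id = row[0]
    # Step 2: find the catalog asset store (this sketch takes the first match).
    row = cur.execute(
        "select relatedstore_id from storerel where store_id = ? "
        "and streltyp_id = -4 and relatedstore_id not in (?)",
        (store_id, store_id)).fetchone()
    if row is None:
        return None
    asset_store_id = row[0]
    # Step 3: get the master catalog ID of the asset store.
    row = cur.execute(
        "select catalog_id from storecat where storeent_id = ? "
        "and mastercatalog = '1'",
        (asset_store_id,)).fetchone()
    return row[0] if row else None


print(find_master_catalog_id(make_demo_db(), "DemoESite"))
```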
localename
Optional: The locale to index.
  • All
  • de_DE
  • en_US
  • es_ES
  • fr_FR
  • it_IT
  • ja_JP
  • ko_KR
  • pl_PL
  • pt_BR
  • ro_RO
  • ru_RU
  • zh_CN
  • zh_TW
The default value is All.
indextype
Optional: Indicates the search engine index to set up for a more granular level of indexing.
Valid values:
  • CatalogEntry: Sets up the index for catalog entries in the master catalog.
  • CatalogGroup: Sets up the index for categories in the master catalog.

    If you do not use the indextype parameter, both the CatalogEntry and CatalogGroup indexes are built by default.

indexSubType
Optional: Indicates the search engine index subtypes to set up. If you include multiple values, they must be separated by a comma.
Valid values:
Structured
Sets up the index for structured content.
Unstructured
Sets up the index for unstructured content.
WebContent
Sets up the index for site content.
Inventory
Sets up the index for inventory data.

If you do not use the indexSubType parameter, the Structured, Unstructured, and WebContent index subtypes are built by default. If you set the indextype to be CatalogGroup, you can set the indexSubType to be Structured. You can set any indexSubType value when you set CatalogEntry to be the indextype value.
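The combination rule above can be checked before the utility is called. The following Python sketch is one way to do that; the function and constant names are hypothetical, and the utility itself might validate these values differently.

```python
ALL_SUBTYPES = {"Structured", "Unstructured", "WebContent", "Inventory"}

# Subtypes that this topic allows for each indextype value.
ALLOWED_SUBTYPES = {
    "CatalogEntry": ALL_SUBTYPES,
    "CatalogGroup": {"Structured"},
}


def rejected_subtypes(index_type, index_sub_type):
    """Return the comma-separated subtype values not allowed for index_type."""
    values = [v.strip() for v in index_sub_type.split(",") if v.strip()]
    allowed = ALLOWED_SUBTYPES[index_type]
    return [v for v in values if v not in allowed]


print(rejected_subtypes("CatalogGroup", "Structured,Inventory"))
print(rejected_subtypes("CatalogEntry", "Structured,WebContent"))
```

An empty list means that every requested subtype is valid for the given index type.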

dbuser
Required for DB2 and Oracle databases:

DB2: The name of the user who is connecting to the database.

Oracle: The user ID connecting to the database.

Optional for Derby databases.

dbuserpwd
Required for DB2 and Oracle databases: The password for the user who is connecting to the database.

Alternatively, you can use the passwordFile parameter to specify the encrypted password from a file.

Optional for Derby databases.

fullbuild
Optional: A flag that indicates whether it is a full index build. The accepted values are either true or false. The default value is true.
statusInterval
Optional: The interval in milliseconds that the utility uses to check the index building status. The default value is 10,000 milliseconds.
Note: If it takes too long to index many languages, reduce the status interval to a lower value.
basePath
(WebSphere Commerce Developer) Optional: A list of directories that contain the manifest.txt file for web content. For instance, WCDE_installdir\workspace\Stores\WebContent\AuroraESite\StaticContent\en_US is the directory for the AuroraESite starter store in United States English.

In Extended Sites, if different Extended Site stores have their own pages to index, multiple directories are passed into -basePath separated by a comma. In this case, the -storeId must be provided, with the order of the storeIds according to the order of the basePath.

searchuser
Optional: The user name of the search server. This parameter is required if WebSphere Application Security is enabled for WebSphere Commerce Search.
searchuserpwd
Optional: The password of the search server user. This parameter is required if WebSphere Application Security is enabled for WebSphere Commerce Search. Alternatively, you can use the passwordFile parameter to specify the encrypted password from a file.
solrConnTimeout
Optional: The time in milliseconds that the Solr connection stays open before it times out. The default value is 100 milliseconds. Use this parameter if you encounter timeout issues when you run the di-buildindex utility.
soTimeout
Optional: The time in milliseconds that the socket read stays open before it times out. Use this parameter if you encounter timeout issues when you run the di-buildindex utility.
storeId
Optional: Used with basePath. In Extended Sites, if different Extended Site stores have their own pages to index, multiple directories are passed into -basePath, separated by commas. In this case, -storeId must be provided, with the store IDs listed in the same order as the corresponding basePath directories.
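Because the order of the store IDs must match the order of the basePath directories, it can help to build both values from one ordered list. The following Python sketch shows the idea; the function name, store IDs, and directory paths are hypothetical.

```python
def web_content_args(stores):
    """Build matching -basePath and -storeId values.

    stores is an ordered list of (store_id, base_path) pairs, so both
    comma-separated values come out in the same order.
    """
    base_paths = ",".join(path for _, path in stores)
    store_ids = ",".join(str(store_id) for store_id, _ in stores)
    return ["-basePath", base_paths, "-storeId", store_ids]


# Hypothetical stores and directories, for illustration only.
print(web_content_args([
    (20001, "/stores/SiteA/StaticContent/en_US"),
    (20002, "/stores/SiteB/StaticContent/en_US"),
]))
```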
webcontentDelete
Optional: A flag to indicate whether WebSphere Commerce Search deletes the site content index. For instance, to delete and rebuild the current site content index, run the di-buildindex utility with -webcontentDelete true, then rerun the di-buildindex utility with -webcontentDelete false.

The default value is false.

workspace
Optional: The workspace index to build. This value is case-sensitive. If not specified, the base schema index is built.
To get the workspace ID, either:
  • Open the workspace in the Workspace Management tool in the Management Center. The workspace code is the workspace ID; or
  • If the workspace has an active task group, run the following SQL query: select * from cmwsschema, where the workspace ID is listed under the workspace column.
dbURL
(Oracle) The database URL that the utility uses to connect to the database. If not provided, the utility constructs a database URL based on the default database value.
retries
Optional: Indicates the number of times the utility retries sending a status command check to the Solr server before indicating that the di-buildindex utility has failed. Retries might help the index build recover from minor issues, such as temporary networking errors.
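To make the interaction between the status interval and the retry count concrete, the following Python sketch shows a generic polling loop. It is not the utility's actual implementation: the function names are hypothetical, and it assumes that failed status checks are counted in total rather than consecutively.

```python
import time


def wait_for_build(check_finished, retries, status_interval_ms=10_000,
                   sleep=time.sleep):
    """Poll until check_finished() returns True.

    check_finished is a hypothetical callable that asks the search server
    for the build status and raises ConnectionError on a failed check.
    Returns the number of failed checks that were tolerated.
    """
    failed_checks = 0
    while True:
        try:
            if check_finished():
                return failed_checks
        except ConnectionError:
            failed_checks += 1
            if failed_checks > retries:
                # Give up once the allowed number of retries is used.
                raise RuntimeError("status check failed after retries")
        # Wait for the status interval before checking again.
        sleep(status_interval_ms / 1000)
```

With retries set to 0, the first failed status check ends the wait with an error; a higher value lets the loop ride out short network interruptions.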
numOfLangsParallel
Optional: Indicates the number of languages to build in parallel. For example, indicating 2 results in the first two languages being built in parallel, followed by the next two when complete, until all languages are built. If not specified, all languages are built at the same time.
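The batching behavior described for numOfLangsParallel can be pictured with the following Python sketch. The function name is hypothetical, and the sketch only models the grouping of languages, not how the utility schedules the builds.

```python
def language_batches(locales, num_of_langs_parallel=None):
    """Group locales into the batches that would be built together."""
    if not num_of_langs_parallel:
        # Not specified: all languages are built at the same time.
        return [list(locales)]
    size = num_of_langs_parallel
    return [list(locales[i:i + size]) for i in range(0, len(locales), size)]


print(language_batches(["en_US", "fr_FR", "de_DE", "ja_JP", "zh_CN"], 2))
```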
passwordFile
Optional: The full path to the password.properties file that contains password values used by this utility. For instance, C:\password.properties.
The password.properties file can contain either of the following two parameters:

dbUserPassword=encrypted_database_pwd
where encrypted_database_pwd is a password that is encrypted with the wcs_encrypt utility (without specifying the Merchant key).
searchAdminPassword=encrypted_search_user_pwd 
where encrypted_search_user_pwd is the search server user password and is encrypted with the wcs_encrypt utility (without specifying the Merchant key).
restartTime
The restart time to begin invalidating the cache. The input pattern is MM/dd/yyyy_HH:mm:ss.
Note: This parameter does not support Derby databases.
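A restartTime value in the MM/dd/yyyy_HH:mm:ss pattern can be generated from a timestamp. The following Python sketch assumes that the pattern uses the usual Java date format letters (two-digit month and day, four-digit year, 24-hour clock); the function names are hypothetical.

```python
from datetime import datetime

# Python equivalent of MM/dd/yyyy_HH:mm:ss (assumed Java-style pattern).
RESTART_TIME_FORMAT = "%m/%d/%Y_%H:%M:%S"


def format_restart_time(moment):
    """Format a datetime as a restartTime value."""
    return moment.strftime(RESTART_TIME_FORMAT)


def parse_restart_time(text):
    """Parse a restartTime value; raises ValueError if it does not match."""
    return datetime.strptime(text, RESTART_TIME_FORMAT)


print(format_restart_time(datetime(2024, 3, 5, 14, 7, 9)))
```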
force
When set to true, forces the utility to run, even if other processes are in progress. Ensure that this parameter value matches for both the di-preprocess and di-buildindex utilities; otherwise, the utility encounters errors and fails to run.
validateindex
Indicates whether to validate the index at run time. The default value is false.
runCategoryRules
This parameter is used for registering events for rule based sales categories. The default value is true.
taskgroup
The task group identifier. It is mandatory if -runCategoryRules is set to true and the -workspace parameter is specified.
task
The task identifier. It is mandatory if -runCategoryRules is set to true and the -workspace parameter is specified.

Example

From the following directory on your WebSphere Commerce machine:
  • WC_installdir/bin
  • WebSphere Commerce Developer: WCDE_installdir\bin
Run the following command:
  • Windows
    di-buildindex.bat -instance instance_name -masterCatalogId masterCatalogId [-localename localename]
    [-indextype indextype] [-indexSubType indexSubType]
    [-dbuser dbuser] [-dbuserpwd dbuserpwd] [-fullbuild true | false] [-statusInterval statusInterval] 
    [-storeId storeId] [-webcontentDelete true | false] 
    [-solrConnTimeout solrConnTimeout] [-soTimeout soTimeout] [-workspace workspaceId] [-retries retries] 
    [-passwordFile passwordFile] [-restartTime restartTime]
  • Linux, AIX, IBM i
    di-buildindex.sh -instance instance_name -masterCatalogId masterCatalogId  [-localename localename]
    [-indextype indextype] [-indexSubType indexSubType]
    [-dbuser dbuser] [-dbuserpwd dbuserpwd] [-fullbuild true | false] [-statusInterval statusInterval] 
    [-storeId storeId] [-webcontentDelete true | false] 
    [-solrConnTimeout solrConnTimeout] [-soTimeout soTimeout] [-workspace workspaceId] [-retries retries] 
    [-passwordFile passwordFile] [-restartTime restartTime]
  • WebSphere Commerce Developer
    • DB2, Oracle: di-buildindex.bat -masterCatalogId masterCatalogId [-indextype indextype]
    • Apache Derby: Because Apache Derby supports only one database connection, you must provide more parameters to build the web content index subtype:
      di-buildindex.bat -masterCatalogId masterCatalogId  [-localename localename]
      [-indextype indextype] [-indexSubType indexSubType] [-webcontentDelete true | false] 
      [-basePath basePath] [-storeId storeId]
If the utility runs successfully, the following message is displayed in the Command window:
Data import process completed successfully with no errors.
Also, inspect the following file for errors:
  • Linux, AIX, Windows: WC_installdir\logs\wc-dataimport-buildindex.log
For more information about exit codes, see WebSphere Commerce Search utility exit codes.
To get more logging information, update the logging level from INFO to FINEST in the WC_installdir/instances/instance_name/xml/config/dataimport/buildindex-logging.properties file:
# Default global logging level, INFO
com.ibm.commerce.level=FINEST
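When the utility is called from a script, the command line from the Example section can be assembled programmatically. The following Python sketch builds an argument list from the parameter names in this topic and checks captured output for the success message. The function names are hypothetical, and the sketch does not run the utility itself.

```python
SUCCESS_MESSAGE = "Data import process completed successfully with no errors."


def build_command(script, master_catalog_id, instance=None, **options):
    """Assemble a di-buildindex argument list.

    options maps parameter names from this topic (for example localename
    or fullbuild) to values; booleans are written as true or false.
    """
    args = [script]
    if instance is not None:
        args += ["-instance", instance]
    args += ["-masterCatalogId", str(master_catalog_id)]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        args += [f"-{name}", str(value)]
    return args


def build_succeeded(output):
    """Return True if the captured output contains the success message."""
    return SUCCESS_MESSAGE in output


print(build_command("di-buildindex.sh", 10101, instance="demo",
                    localename="en_US", fullbuild=False))
```

The resulting list can be passed to a process launcher of your choice. Checking the log file and the exit code, as described above, remains the more reliable way to confirm the result.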