Creating and indexing shards

This task sets up and indexes a specified number of sharding cores by using the search index setup utility (setupSearchIndex), an input properties file, and the shard indexing utility (di-parallel-process).

Before you begin

  1. Decide on the sharding configuration you want to use. For example, you can use only horizontal shards, or a combination of horizontal and vertical shards.
  2. Decide on the number of shards to create. An extra vertical shard is created by the setupSearchIndex utility when the configParallelShards action is used.
  3. Run the setupSearchIndex utility once using only the default parameters: instance, masterCatalogId, dbuser, and dbuserpwd. For more information about the command and its default parameters, see Step 3 in the Procedure section of this topic.
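
  When planning the shard count, it can help to list the shards you expect the setup utility to create. The following Python sketch applies the rules stated in this topic for numOfShards, shardTags, and the configParallelShards action. The function name plan_shards is hypothetical and is not part of WebSphere Commerce. It also assumes that default tags are assigned in order starting at A, which this topic implies but does not state outright.

  ```python
  import string


  def plan_shards(num_of_shards, action, shard_tags=None):
      """Return (shard_names, expected_total) for a planned setup run.

      Hypothetical planning helper; it does not call any Commerce utility.
      """
      if shard_tags is None:
          # Without shardTags, letters A to Z are used, so 26 is the maximum.
          if not 1 <= num_of_shards <= 26:
              raise ValueError("without shardTags, numOfShards must be 1 to 26")
          # Assumption: default tags are taken in alphabetical order.
          shard_tags = list(string.ascii_uppercase[:num_of_shards])
      elif len(shard_tags) != num_of_shards:
          raise ValueError("the number of shard tags must match numOfShards")

      names = [f"Shard-{tag}" for tag in shard_tags]

      # configParallelShards creates one extra vertical shard. Its name is not
      # given in this topic, so only the expected total is adjusted.
      extra = 1 if action == "configParallelShards" else 0
      return names, num_of_shards + extra
  ```

  For example, plan_shards(3, "configHorizontalShards", ["X", "Y", "Z"]) yields the names Shard-X, Shard-Y, and Shard-Z with an expected total of 3.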

About this task

During this task, you perform the following high-level steps:
  1. Set up the sharding cores by running the setupSearchIndex utility and specifying the shard-specific actions and parameters.
  2. Create and populate the input properties file that is used to process indexing shards.
  3. Run the di-parallel-process utility to index the shards.

Procedure

Set up the sharding cores
  1. Complete one of the following tasks.
    • Linux, AIX: Log on as a WebSphere Commerce non-root user.
    • IBM i: Log on with a user profile that has *SECOFR authority.
    • Windows: Log on with a user ID that is a member of the Windows Administration group.
  2. Go to the following directory:
    • WC_installdir/components/foundation/subcomponents/search/bin
    • WebSphere Commerce Developer, DB2: WCDE_installdir\components\foundation\subcomponents\search\bin
  3. Run the search index setup utility:
    • WebSphere Commerce Developer, DB2: setupSearchIndex.bat -masterCatalogId masterCatalogId -indextype CatalogEntry -numOfShards numOfShards -action action -dbuser db_user -dbuserpwd db_password [-shardTags shardTags]
    • Windows: setupSearchIndex.bat -instance instance_name -masterCatalogId masterCatalogId -indextype CatalogEntry -numOfShards numOfShards -action action -dbuser db_user -dbuserpwd db_password [-shardTags shardTags]
    • Linux, AIX, IBM i: setupSearchIndex.sh -instance instance_name -masterCatalogId masterCatalogId -indextype CatalogEntry -numOfShards numOfShards -action action -dbuser db_user -dbuserpwd db_password [-shardTags shardTags]
    Where:
    action
    The action that the utility performs, such as configuring the shards. Specify one of the following values:
      configHorizontalShards
      Configures indexing for new horizontal shards.
      configHorizontalShardsUpdate
      Configures indexing for existing horizontal shards.
      configParallelShards
      Configures indexing for new parallel shards, which creates horizontal shards, plus one vertical shard.
      configParallelShardsUpdate
      Configures indexing for existing parallel shards.
      configShardsReset
      Deletes all existing shards.
    masterCatalogId
    The ID of the master catalog (for example, 10101).
    If you do not know the master catalog ID, run the following SQL for your starter store:
    SQL: select * from catalog where IDENTIFIER='STORE_IDENTIFIER'
    
    To find the master catalog ID for an Extended Site store:
    1. Find the store ID:
      select * from storeent where IDENTIFIER='STORE_IDENTIFIER'
      
    2. Use the storeent_id as the store_id in the following SQL to find the Catalog Asset store ID of this Extended Site store:
      
      select * from storerel where store_id=XXXXXX and streltyp_id=-4 and relatedstore_id not in (XXXXXX)
      
      Where XXXXXX is the storeent_id.
    3. Get the master catalog ID:
      
      select * from storecat where storeent_id=YYYYYY and mastercatalog='1'
      
      Where YYYYYY is the relatedstore_id.
    numOfShards
    The number of sharding cores you want to prepare.
    dbuser
    The name of the user that is connecting to the database.
    dbuserpwd
    The password for the user that is connecting to the database.
    shardTags
    Optional: Creates the specified shard tags.
    For example, passing in -shardTags X,Y,Z creates the following shards: Shard-X, Shard-Y, and Shard-Z.
    Note: The number of shard tags must match the number of shards. If shard tags are not specified, letters from A to Z are used as tags, so numOfShards can be at most 26.
  4. Verify that the preprocessing shards were created successfully according to the number of shards that are specified in the utility, under the following directory:
    • WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/db_type/Shards/Shard-#Shard-Tag#
    Note: For a vertical configuration, the number of preprocessing shards that exist is the number of shards that are specified in the utility, plus one (numOfShards+1).
  5. Verify that the indexing shards were created successfully according to the number of shards that are specified in the utility, under the following directory:
    • WC_installdir/instances/instance_name/search/solr/home/MC_masterCatalogId/Shards/locale
  6. Restart the WebSphere Commerce Search server.
  7. If you are using indexed contract prices, complete the following steps:
    1. Open the following file for all shards: WC_installdir/instances/instance_name/search/solr/home/MC_masterCatalogId/Shards/locale/CatalogEntry_#Shard-Tag#/conf/wc-data-config.xml.
    2. Find all instances of the following table name: TI_CNTRPRICE_0_#Shard-Tag#.
    3. Replace them with the following table name: TI_CNTRPRICE_0.

      This update enables the utilities to look for contract prices in the TI_CNTRPRICE_0 table, instead of the shard-specific tables.

    4. Save your changes.
    5. Run the di-calculateprice utility. For specific usage, see Adding contract prices to the Catalog Entry index.
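
  The find-and-replace in the contract price steps above can be scripted when there are many shards. The following Python sketch shows the text substitution only. The function name use_shared_contract_price_table is hypothetical, and reading and writing each shard's wc-data-config.xml file (ideally after making a backup copy) is left to the caller.

  ```python
  import re


  def use_shared_contract_price_table(config_text, shard_tag):
      """Replace TI_CNTRPRICE_0_<shard_tag> with TI_CNTRPRICE_0.

      Returns (new_text, number_of_replacements). Hypothetical helper that
      works on the file content as a string.
      """
      # The lookahead avoids touching longer names that merely start with
      # the same tag, for example tag "A" versus a table ending in "_AB".
      pattern = re.compile(
          r"TI_CNTRPRICE_0_" + re.escape(shard_tag) + r"(?![A-Za-z0-9_])"
      )
      return pattern.subn("TI_CNTRPRICE_0", config_text)
  ```

  A replacement count of zero for a shard suggests that its file was already updated or that the tag passed in does not match the file.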
Prepare the input properties file
  1. Create an input properties file to be used for indexing, based on the output of the setup utility.
    The sharding input properties file is used by the di-parallel-process utility to process indexing shards. It contains the following sections of properties:
    • System properties, which are shared by other utilities, such as passwords that are common among utilities.
    • Database properties, which are used to establish database connections with the database server.
    • Global preprocessing and indexing properties, which are used for preprocessing and indexing by all shards.
    • Master search server properties, which specify the master index cores where all shard data is merged.
    • Horizontal Shard properties, which specify the horizontal shard properties.
    • Vertical Shard properties, which specify the vertical shard properties.

    Use the following sample file for reference: di-parallel-process.zip

    The sample uses the en_US locale with shards A, B, and C. It contains the following files:
    di-parallel-process-FEP8-linux-oracle.properties
    The sample sharding input properties file for a Linux operating system that runs an Oracle database.
    password.properties
    The sample password properties file, referenced by the sample sharding input properties file. It contains passwords encrypted by the wcs_encrypt utility.

    For more information about the properties file and expected values, see Sharding input properties file.

Run the shard indexing utility
  1. Go to the following directory:
    • WC_installdir/bin
    • WebSphere Commerce Developer, DB2: WCDE_installdir\bin
  2. Run the utility:
    • WebSphere Commerce Developer, DB2: di-parallel-process.bat input_properties_file
    • Windows: di-parallel-process.bat input_properties_file -instance instance_name [-dbuser dbuser] [-dbuserpwd dbuserpwd] [-searchuser searchuser] [-searchuserpwd searchuserpwd]
    • Linux, AIX, IBM i: di-parallel-process.sh input_properties_file -instance instance_name [-dbuser dbuser] [-dbuserpwd dbuserpwd] [-searchuser searchuser] [-searchuserpwd searchuserpwd]
    Where:
    input_properties_file
    The relative path of the input properties file to pass into the utility.
    For example, ../../di-parallel-process-linux-oracle.properties
    instance_name
    The name of the WebSphere Commerce instance with which you are working (for example, demo).
    dbuser
    Optional: The name of the user that is connecting to the database.
    dbuserpwd
    Optional: The password for the user that is connecting to the database.
    searchuser
    Optional: The search application administrative user name.
    searchuserpwd
    Optional: The search application administrative user password.
  3. Ensure that the utility ran successfully. Check either the exit code or the wc-dataimport-parallel-processor.log file for more information.
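
  If the indexing run is started from a script, the exit code check can be automated. The following Python sketch runs a command and reports its exit code. The function name run_utility is hypothetical, and treating exit code 0 as success is the usual operating system convention rather than something this topic states, so confirm the result against the log file.

  ```python
  import subprocess


  def run_utility(command):
      """Run a command given as an argument list.

      Returns (succeeded, exit_code). Assumption: exit code 0 means success.
      """
      completed = subprocess.run(command)
      return completed.returncode == 0, completed.returncode


  # Example call with the documented arguments (placeholder values):
  # ok, code = run_utility(
  #     ["./di-parallel-process.sh", "input_properties_file",
  #      "-instance", "instance_name"])
  ```

  Passing the command as a list avoids shell quoting problems when the properties file path contains spaces.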

What to do next

  1. Replicate the merged index into the repeater, then to the other nodes.