Creating descriptive storefront URLs when duplicate keywords exist

You can use the Data Load utility, the seourlkeywordgen utility, or canonical URLs to create descriptive URLs that avoid unwanted characters when duplicate keywords exist.
Catalog structures can have categories or products that share names but are different. For example, if your store provides apparel products, you can have a Shirt category under a Womens category and another Shirt category under a Mens category. Given this structure, you might want to have the following URLs for your categories:
  • http://example.site.com/shop/en/samplestore/womens/shirt
  • http://example.site.com/shop/en/samplestore/mens/shirt
The Search Engine Optimization (SEO) feature includes a seourlkeywordgen utility that helps generate URL keywords for categories or products. These generated URLs are based on the category or product name. The SEO framework, however, does not allow duplicate keyword names. This framework enforces the uniqueness of URL keywords to reduce the negative impact on performance during URL deconstruction. If there are duplicate keywords, the server must perform several lookups to find the matching object. Also, the deconstruction framework must save the parent keyword to properly identify the matching URL, making the algorithm more complex.
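The cost of duplicate keywords during URL deconstruction can be illustrated with a minimal sketch. It is not the HCL Commerce implementation: the lookup tables, the function names, and all identifiers except 10002 are hypothetical and exist only to contrast the two lookup strategies.

```python
# Hypothetical keyword tables; identifiers are made up for illustration.
UNIQUE_KEYWORDS = {
    "womens": 10000,
    "shirt": 10001,
    "mens": 10003,
    "shirt10002-1": 10002,
}

# If keywords could repeat, the parent keyword would have to be part of the key.
DUPLICATE_KEYWORDS = {
    (None, "womens"): 10000,
    ("womens", "shirt"): 10001,
    (None, "mens"): 10003,
    ("mens", "shirt"): 10002,
}


def resolve_unique(segments):
    """One lookup: the last keyword alone identifies the object."""
    return UNIQUE_KEYWORDS.get(segments[-1])


def resolve_with_duplicates(segments):
    """One lookup per segment, carrying the parent keyword along."""
    parent, found = None, None
    for keyword in segments:
        found = DUPLICATE_KEYWORDS.get((parent, keyword))
        if found is None:
            return None
        parent = keyword
    return found


unique_result = resolve_unique(["mens", "shirt10002-1"])
duplicate_result = resolve_with_duplicates(["mens", "shirt"])
```

Enforcing unique keywords keeps deconstruction on the first, simpler path.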
This limitation causes the utility to default to generating alternative keywords that are based on a combination of various attributes:
  • For categories, the alternative keyword that is generated is the combination of the category name, category identifier, and language identifier.
  • For products, the alternative keyword that is generated is the combination of the product name, product part number, and language identifier.
Based on this behavior, the two preceding sample URLs become the following URLs:
  • http://example.site.com/shop/en/samplestore/womens/shirt
  • http://example.site.com/shop/en/samplestore/mens/shirt10002-1
In the mens shirt URL, the category identifier 10002 and the language identifier -1 are now appended to the keyword. This generated URL might not meet your business needs because it contains numbers. To avoid these unwanted characters in your URLs, consider one of the following options for creating descriptive URLs for your storefront.
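The fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the utility's actual code: the function name, the lowercase and hyphen normalization, and the category identifier 10001 are hypothetical. Only the composition of name, identifier, and language identifier comes from the description above.

```python
def generate_keyword(name, identifier, language_id, used):
    """Return a unique URL keyword for a category or product.

    The name is tried first. If it is already taken, the alternative
    keyword is built from name + identifier + language identifier.
    """
    # Assumed normalization: lowercase, spaces replaced with hyphens.
    primary = name.lower().replace(" ", "-")
    if primary not in used:
        used.add(primary)
        return primary
    alternative = f"{primary}{identifier}{language_id}"
    used.add(alternative)
    return alternative


used_keywords = set()
womens = generate_keyword("Shirt", 10001, -1, used_keywords)  # 10001 is hypothetical
mens = generate_keyword("Shirt", 10002, -1, used_keywords)
```

Here the first Shirt category keeps the plain keyword, and the second receives the keyword with the identifiers appended, matching the mens shirt URL shown above.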

You have several options to work around the HCL Commerce SEO limitation on duplicate keywords in your store. By selecting one of these options, you can create more descriptive, SEO-friendly URLs for your categories and products, which can improve your page ranking.

Option 1: Use the Data Load utility to override duplicate keywords

The Data Load utility supports loading data from a CSV file with Catalog Import, and you do not need to understand the database schema to prepare the file. If you know which categories or products have the same name, you can use the Data Load utility to load more meaningful but still unique keywords. For example, you might want the Shirts category under the Mens category to have a more meaningful keyword such as shirts-for-him. The resulting URLs can resemble the following URLs:
  • http://example.site.com/shop/en/samplestore/womens/shirts
  • http://example.site.com/shop/en/samplestore/mens/shirts-for-him
As you can see from the preceding URLs, the keywords are now distinct and do not contain numbers.
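Before you prepare override keywords, it can help to list which categories share a keyword. The following sketch does that for an in-memory list. The category paths, the function name, and the override map are hypothetical, and the sketch does not produce a Data Load input file because that file's layout is not described here.

```python
from collections import defaultdict


def find_duplicate_keywords(categories):
    """Group category paths by keyword and keep only the keywords that clash."""
    by_keyword = defaultdict(list)
    for path, keyword in categories:
        by_keyword[keyword].append(path)
    return {kw: paths for kw, paths in by_keyword.items() if len(paths) > 1}


# Hypothetical catalog: (category path, default keyword derived from the name).
categories = [
    ("Womens/Shirts", "shirts"),
    ("Mens/Shirts", "shirts"),
    ("Mens/Pants", "pants"),
]

duplicates = find_duplicate_keywords(categories)

# Hand-picked override for all but one of the clashing categories.
overrides = {"Mens/Shirts": "shirts-for-him"}
final_keywords = {path: overrides.get(path, kw) for path, kw in categories}
```

After the overrides are applied, every category in `final_keywords` has a distinct keyword, which is the condition that the SEO framework enforces.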

For more information about the Data Load utility, see Overview of the Data Load utility.

Option 2: Configure the seourlkeywordgen utility to use other attributes to resolve keyword conflicts

The SEO-BaseComponentLogic.jar file contains several XML files that are located within the com\ibm\commerce\seo\loader directory. The SEO-BaseComponentLogic.jar file is located within the following directory:
  • WC_profiledir/ts.ear/SEO-BaseComponentLogic.jar
The files that are within this JAR file are used by the seourlkeywordgen utility to generate unique keywords. You can modify these XML files to have the utility use other category or product attributes when it constructs the main URL keyword and alternative URL keyword. The XML files contain parameters and query tags, which you can modify.
For example, the following code snippet is a sample of some of the contents within the com\ibm\commerce\seo\loader\category.xml file:
   <parameter    generatorId="paramShareLanguage"      subClass="EnvParameterGenerator"    seed="shareURLKeywordForAllLanguages"  />
   <parameter    generatorId="paramStoreId"            subClass="EnvParameterGenerator"    seed="storeId"    />
   <parameter    generatorId="paramCatalogId"          subClass="EnvParameterGenerator"    seed="catalogId"  />
   <!-- 1 -->
   <parameter    generatorId="paramCatGroupKeyword"    subClass="EnvParameterGenerator"    default="NAME"    />
   <!-- 2 -->
   <parameter    generatorId="paramCatGroupKwd2"       subClass="EnvParameterGenerator"    default="NAME+CATGROUP_ID+LANGUAGE_ID" />
   <parameter    generatorId="paramChangeF"            subClass="ValueGenerator"   seed="N"    isString="true" />
   <parameter    generatorId="paramPriority"           subClass="ValueGenerator"   seed="0" />
   ….
   <!-- 3 -->
   <query>
     <select>
       CATGROUP.CATGROUP_ID, CATGRPDESC.NAME, CATGRPDESC.LANGUAGE_ID
     </select>
     <from>
       CATGROUP, CATGRPREL, CATTOGRP, CATGRPDESC, STOREENT
     </from>
     …..
   </query>
The numbered items in the sample are described as follows:
  • 1: The paramCatGroupKeyword parameter is the field that is used to generate the SEO keyword for the category. In the sample, the NAME attribute is used as the keyword.
  • 2: For a duplicate SEO keyword, the paramCatGroupKwd2 parameter is the field that is used to generate the alternative keyword for the category.
  • 3: You can use any columns from the database tables that the query is accessing. To use more columns, add the column name to the select tag. Likewise, if you want to use columns from a custom table, you must add the custom table name to the from tag.
For example, if you want to use the field1 column from the CATGROUP table as part of the alternative keyword, your category.xml file can contain the following code:
<parameter    generatorId="paramCatGroupKeyword"    subClass="EnvParameterGenerator"    default="NAME" />
<parameter    generatorId="paramCatGroupKwd2"       subClass="EnvParameterGenerator"    default="NAME+FIELD1" />

<query>
  <select>
    CATGROUP.CATGROUP_ID, CATGRPDESC.NAME, CATGRPDESC.LANGUAGE_ID, CATGROUP.FIELD1
  </select>
  <from>
    CATGROUP, CATGRPREL, CATTOGRP, CATGRPDESC, STOREENT
  </from>
….
</query>
As another example, you might want to combine the NAME column with the MYFIELD column from a custom table that is called MYTABLE. To use these columns as the first choice for a keyword, your category.xml file could contain the following code:
<parameter    generatorId="paramCatGroupKeyword"    subClass="EnvParameterGenerator"    default="NAME+MYFIELD" />
<parameter    generatorId="paramCatGroupKwd2"       subClass="EnvParameterGenerator"    default="NAME+MYFIELD+CATGROUP_ID" />

<query>
  <select>
    CATGROUP.CATGROUP_ID, CATGRPDESC.NAME, CATGRPDESC.LANGUAGE_ID, MYTABLE.MYFIELD
  </select>
  <from>
    CATGROUP, CATGRPREL, CATTOGRP, CATGRPDESC, STOREENT, MYTABLE
  </from>
….
</query>
After you change your XML files, you must rerun the seourlkeywordgen utility for these new rules to take effect.
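To reason about what a given default expression produces, the following sketch concatenates the column values that an expression names. It assumes that the values are joined in order with no separator, which matches the shirt10002-1 example earlier in this topic. The function name and the FIELD1 value are hypothetical, and the utility's real evaluation logic might differ.

```python
def build_keyword(expression, row):
    """Concatenate the column values named in an expression such as NAME+FIELD1."""
    parts = []
    for column in expression.split("+"):
        column = column.strip()
        if column not in row:
            # Mirrors the rule that a column must appear in the select tag.
            raise KeyError(f"column {column!r} is not returned by the select list")
        parts.append(str(row[column]))
    return "".join(parts)


# One result row of the query; the FIELD1 value is a made-up example.
row = {"CATGROUP_ID": 10002, "NAME": "shirt", "LANGUAGE_ID": -1, "FIELD1": "mens"}

default_alternative = build_keyword("NAME+CATGROUP_ID+LANGUAGE_ID", row)
custom_alternative = build_keyword("NAME+FIELD1", row)
```

An expression that names a column missing from the row raises an error in this sketch, which is a reminder to add each new column to the select tag and each new table to the from tag.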

Option 3: Use canonical URLs instead of hierarchical URLs

You can also consider using the canonical URL pattern instead of the hierarchical URL pattern for your storefront URLs.
Hierarchical URL examples with descriptive keywords:
  • http://example.site.com/shop/en/samplestore/womens/womens-shirt
  • http://example.site.com/shop/en/samplestore/mens/mens-shirt
Canonical URL examples:
  • http://example.site.com/shop/en/samplestore/womens-shirt
  • http://example.site.com/shop/en/samplestore/mens-shirt
Note: There is no tooling available to help you generate your descriptive keywords when you use this option. You must change your keywords manually within Management Center or use the Data Load utility to load new keywords.
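The difference between the two patterns can be sketched as simple string construction. The function names are hypothetical and the sketch does not use the store's URL pattern files. It only shows that the canonical form drops the parent category segments, so the keyword itself must carry the descriptive text.

```python
BASE = "http://example.site.com/shop/en/samplestore"


def hierarchical_url(base, parent_keywords, keyword):
    """Include every parent category keyword ahead of the object's keyword."""
    return "/".join([base, *parent_keywords, keyword])


def canonical_url(base, keyword):
    """Use only the object's own keyword, without the catalog hierarchy."""
    return f"{base}/{keyword}"


hierarchical = hierarchical_url(BASE, ["mens"], "mens-shirt")
canonical = canonical_url(BASE, "mens-shirt")
```

Because the canonical URL has no parent segments, a keyword such as shirt would give no hint of which Shirt category is meant, which is why descriptive keywords such as mens-shirt matter with this option.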
The HCL Commerce starter store JSP files use the hierarchical URL pattern to construct the breadcrumb trail and all category links in the header. When you use canonical URL patterns, your URLs do not contain the catalog hierarchy. To use the canonical URL pattern, you must change your store JSP code to use the canonical URL pattern name. This name is used to construct the storefront URLs for your categories and products:
  • For categories, you must change the JSP that constructs the category URLs to use the CanonicalCategoryURL pattern.
  • For products, you must change the JSP that constructs the product URLs to use the ProductURL pattern.
The canonical URL patterns are defined in the workspace_dir/crs-web/WebContent/WEB-INF/xml/seo/stores/store_name/SEOURLPatterns-<objectname>.xml files:
  • SEOURLPatterns.xml
  • SEOURLPatterns-Category.xml
  • SEOURLPatterns-Content.xml
  • SEOURLPatterns-Product.xml
  • SEOURLPatterns-Search.xml
Tip: The URLs constructed for categories in the Aurora starter store SiteMap.jsp file use the CanonicalCategoryURL pattern. The URLs constructed for the items added to the mini shop cart in the Aurora starter store MiniShopCartDisplay_data.jsp use the ProductURL pattern. You can refer to these JSP files as references.