Optimizing Search Results in Hebrew Language Stores

For optimal search results when your catalog includes Hebrew-language data, generate a catalog-specific dictionary. The default dictionary is not optimized for Hebrew-language catalog items. You can use the createTokens.db2.sh script to build a catalog dictionary that is properly optimized for Hebrew storefronts. If your catalog does not include Hebrew data, running this script does not affect the quality of search results.

Before you begin

Run this script at least once after you initially create and upload the catalog to the system.
For optimal results, run the script again each time that the catalog undergoes a significant change, for instance, after you add many catalog products, update them, or change their descriptions.

Procedure

Save the following shell code into a file with the name createTokens.db2.sh in the /opt/IBM/WebSphere/CommerceServer/bin directory:

#!/bin/sh

#-----------------------------------------------------------------
# Licensed Materials - Property of IBM
#
# WebSphere Commerce
#
# (C) Copyright IBM Corp. 2006, 2016 All Rights Reserved.
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with
# IBM Corp.
#-----------------------------------------------------------------

#
#

# show usage of the command
showusage()
{
echo "Usage:"
echo "------"
echo "createTokens.db2.sh dbname userId password langid"
echo "where dbname is the name of the database to be populated"
echo "where userId is the userid of the user who owns the database"
echo "where password is the password of the user"
echo "where langid is the Hebrew language id in the database"

exit 1
}

# end with failure
endfailure()
{
echo "Error connecting to the database.  Check log for details."
exit 1
}

CURDIR=`pwd`
BINDIR=`dirname $0`

# change current directory to bin
cd $BINDIR

# Set up environment variables needed for DB2 connection
if [ -f $BINDIR/config_env.db2.sh ]; then
   . $BINDIR/config_env.db2.sh         
else
   . $CURDIR/config_env.db2.sh         
fi

if [ $# -eq 4 ]; then
	
	DATABASE=$1
	USER=$2
	PASSWORD=$3
	LANGID=$4
	LOG=$WCLOGDIR/createTokens.db2.log
	TOKENS=$WCLOGDIR/Tokens.txt
	TOKENSZIP=$WCLOGDIR/Tokens.zip
	
	# all four arguments must be non-empty
	if [ "$DATABASE" = "" ]; then
		showusage
	fi
	if [ "$USER" = "" ]; then
		showusage
	fi
	if [ "$PASSWORD" = "" ]; then
		showusage
	fi
	if [ "$LANGID" = "" ]; then
		showusage
	fi

	SUBDIR=`dirname $LOG`
	if [ -f $LOG ]; then
		mv $LOG $LOG.orig
	elif [ ! -d $SUBDIR ]; then
		mkdir -p $SUBDIR
		touch $LOG
	else
		touch $LOG
	fi
		
	cat /dev/null > $TOKENS
	rm -f $TOKENSZIP
	db2 -v connect to $DATABASE user $USER using $PASSWORD >> $LOG 2>&1
	if [ $? -ne 0 ]
	then
		endfailure  
	fi
	db2 -v "SELECT NAME, SHORTDESCRIPTION, LONGDESCRIPTION FROM CATGRPDESC WHERE LANGUAGE_ID=$LANGID" | while read line
	do
		for word in $line
		do
			echo $word | grep -P "[\x80-\xFF]" >> $TOKENS   
		done
	done
	db2 -v "SELECT NAME, SHORTDESCRIPTION, LONGDESCRIPTION FROM CATENTDESC WHERE LANGUAGE_ID=$LANGID" | while read line
	do
		for word in $line
		do
			echo $word | grep -P "[\x80-\xFF]" >> $TOKENS   
		done
	done
	db2 -v "SELECT STRINGVALUE FROM ATTRVALDESC WHERE LANGUAGE_ID=$LANGID" | while read line
	do
		for word in $line
		do
			echo $word | grep -P "[\x80-\xFF]" >> $TOKENS   
		done
	done
	zip $TOKENSZIP $TOKENS
       
else
	showusage
fi
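
The word filter in the script keeps only the words that contain bytes outside the ASCII range, which is how Hebrew text is separated from the rest of the query output. The following minimal sketch shows that filtering step in isolation. The function name extract_tokens is hypothetical, and the sketch swaps in a different technique from the script: it uses tr and a plain grep pattern in the C locale instead of grep -P, on the assumption that grep -P might not be available on every system.

```shell
#!/bin/sh
# Hypothetical helper (not part of createTokens.db2.sh): read text on
# standard input and print only the words that contain non-ASCII bytes.
extract_tokens()
{
    # Put every whitespace-separated word on its own line, then drop the
    # lines that consist only of printable ASCII characters.
    LC_ALL=C tr -s ' \t' '\n' | LC_ALL=C grep -v '^[ -~]*$'
}

# Example: only the two Hebrew words are printed, one per line.
printf 'Cotton shirt חולצה כותנה 100%%\n' | extract_tokens
```

Running the filter on a sample line in this way is a quick check that Hebrew words survive the filter before you run the full script against the database.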

Run the script as follows.
```
./createTokens.db2.sh <DBName> <dbUser> <pwd> <langID>
```
Where

DBName

The Commerce database name. For instance, MALL.

dbUser

The database user name. For instance, dbinst1.

pwd

The password for this user.

langID

The Hebrew language ID used by your database. For instance, -11.

Running the script might take 5 to 10 minutes. It produces the file Tokens.zip in the directory /opt/IBM/WebSphere/CommerceServer/logs.
Update the Tokens.zip entry in the file lucene-analyzers-common-xxx.jar. Inside the JAR file, the entry is located in org/apache/lucene/analysis/he. Update the JAR file in both of the following locations:
/opt/IBM/WebSphere/AppServer/profiles/demo/installedApps/WC_demo_cell/WC_demo.ear/lib/lucene-analyzers-common-xxx.jar
/opt/IBM/WebSphere/AppServer/profiles/demo_solr/installedApps/demo_search_cell/Search_demo.ear/lib/lucene-analyzers-common-xxx.jar
Rebuild the index with the di-buildindex utility (di-buildindex.sh on Linux or AIX, di-buildindex.bat on Windows). See Building the WebSphere Commerce Search index.

Example

./createTokens.db2.sh MALL dbinst1 guest -11