Introduced in Feature Pack 2

Configuring the search preprocessor

In this lesson, you modify the preprocessor configuration for custom data. The preprocessing tasks are controlled by the wc-dataimport-preprocess XML files, which contain table definitions, database schema metadata, and references to the Java classes used in the preprocessing steps. In this task, you add a custom configuration file that references the preprocessing Java classes you create and defines the temporary ranking tables.

About this task

The ranking data is preprocessed in two steps:
  1. Loading the data into a temporary table
  2. Resolving internal WebSphere Commerce referential constraints, and loading the resolved data into a secondary table. The secondary table data is used for indexing purposes.
The data is processed in two steps because the external data does not contain internal identifiers such as CATENTRY.CATENTRY_ID and CATENTRY.MEMBER_ID. In this tutorial, the ranking data refers to CATENTRY.PARTNUMBER, and the CATENTRY_ID is resolved from the PARTNUMBER and MEMBER_ID values. The MEMBER_ID used for this tutorial is set in the code snippet provided.
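The two-step flow described above can be pictured with a small in-memory sketch. It is an illustration only, not the tutorial's implementation: in the tutorial the resolution is done by a SQL join against CATENTRY, and the class name, method names, and the CATENTRY_ID value 10001 below are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: the tutorial performs this resolution in SQL,
// not in Java collections.
class RatingResolutionSketch {

    /** Builds a lookup key from the owning MEMBER_ID and the PARTNUMBER. */
    static String key(long memberId, String partNumber) {
        return memberId + "|" + partNumber;
    }

    /**
     * Step 2 of the flow: keep only the staged rows {partNumber, type, rating}
     * whose part number resolves to a CATENTRY_ID for the given MEMBER_ID,
     * and prefix each resolved row with that identifier.
     */
    static List<String[]> resolve(List<String[]> stagedRows,
            Map<String, Long> catentryIds, long memberId) {
        List<String[]> resolved = new ArrayList<String[]>();
        for (String[] row : stagedRows) {
            Long catentryId = catentryIds.get(key(memberId, row[0]));
            if (catentryId != null) {
                resolved.add(new String[] {
                        catentryId.toString(), row[0], row[1], row[2] });
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        long memberId = 7000000000000000002L; // value used in this tutorial

        // Stand-in for the CATENTRY table (the identifier is made up).
        Map<String, Long> catentryIds = new HashMap<String, Long>();
        catentryIds.put(key(memberId, "AC-01"), 10001L);

        // Step 1: external rows as they would be staged in the temporary table.
        List<String[]> staged = new ArrayList<String[]>();
        staged.add(new String[] { "AC-01", "quality", "1.7" });
        staged.add(new String[] { "NO-SUCH-PART", "quality", "3.0" });

        for (String[] row : resolve(staged, catentryIds, memberId)) {
            System.out.println(row[0] + " " + row[1] + " " + row[2] + " " + row[3]);
        }
    }
}
```

Rows whose part number cannot be matched are dropped, which mirrors the effect of an inner join between the staged data and the catalog entries.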

Procedure

  1. Go to the following directory:
    • WebSphere Commerce Developer: WCDE_installdir/search/pre-processConfig/MC_masterCatalogId/development_db
    • Solaris, Linux, AIX, Windows: WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/target_db
    For example:
    • WebSphere Commerce Developer: WCDE_installdir/search/pre-processConfig/MC_10001/DB2
    • Solaris, Linux, AIX, Windows: WC_installdir/instances/instance_name/search/pre-processConfig/MC_10001/DB2
  2. For the purposes of this tutorial, obtain a copy of Ratings.xml and place it in your C:\IBM\WCDE_ENT70\bin\ folder. This XML file contains your sample ratings data that is imported into your store database.
  3. Create a custom preprocess configuration file and call it wc-dataimport-preprocess-custom.xml:
    
    <?xml version="1.0" encoding="UTF-8"?>
    
    <_config:DIHPreProcessConfig xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../xsd/wc-dataimport-preprocess.xsd ">
     
     <!-- load ratings into temp table -->
      <_config:data-processing-config processor="com.mycompany.commerce.preprocess.StaticRatingsDataPreProcessor" batchSize="500">
        <_config:table definition="CREATE TABLE TI_RATING_TEMP ( PART_NUMBER VARCHAR(256),RTYPE VARCHAR(256), RATING VARCHAR(256))" name="TI_RATING_TEMP"/>
        
       <_config:query sql=""/>
        <_config:mapping>
          <_config:key queryColumn="CATENTRY_ID" tableColumn="CATENTRY_ID"/>
          <_config:column-mapping>
            <_config:column-column-mapping>
               <_config:column-column queryColumn="" tableColumn="" />
            </_config:column-column-mapping>
            </_config:column-mapping>
        </_config:mapping>    
        
        <!-- The inputFile property specifies the input file location so that the path is not hard-coded to the WC\bin directory -->
        <_config:property name="inputFile" value="C:\IBM\WCDE_ENT70\bin\Ratings.xml"/>
    
          
       
      </_config:data-processing-config>
      
      
      <_config:data-processing-config processor="com.mycompany.commerce.preprocess.StaticRatingsDataPopulator" batchSize="500">
        <_config:table definition="CREATE TABLE TI_RATING ( CATENTRY_ID BIGINT NOT NULL, PART_NUMBER VARCHAR(256),RTYPE VARCHAR(256), RATING VARCHAR(256))" name="TI_RATING"/>
        
       <_config:query sql="insert into TI_RATING ( catentry_id,part_number, rating,rtype ) select catentry_id,part_number,rating,rtype from catentry,ti_rating_temp where catentry.partnumber=ti_rating_temp.part_number and catentry.member_id=7000000000000000002"/>
        <_config:mapping>
          <_config:key queryColumn="CATENTRY_ID" tableColumn="CATENTRY_ID"/>
          <_config:column-mapping>
            <_config:column-column-mapping>
               <_config:column-column queryColumn="" tableColumn="" />
            </_config:column-column-mapping>
            </_config:column-mapping>
        </_config:mapping>    
          
       
      </_config:data-processing-config>
      
      
    </_config:DIHPreProcessConfig>
    
    
    Notes:
    • Ensure that the MEMBER_ID and inputFile file path values are correct for your store.
    • The first <_config:data-processing-config> element uses the processor attribute to refer to com.mycompany.commerce.preprocess.StaticRatingsDataPreProcessor, the Java class that loads the data. This element defines the first temporary table, TI_RATING_TEMP, by using the <_config:table> subelement. The remaining subelements are unused and are included to keep the XML well-formed.
    • The second <_config:data-processing-config> element refers to com.mycompany.commerce.preprocess.StaticRatingsDataPopulator, the Java class that reads the data produced by the first stage of preprocessing and resolves the internal identifiers. This element defines the secondary temporary table, TI_RATING, which stores the resolved data. The <_config:query> element defines the SQL used to resolve and load the data. The MEMBER_ID value is set to "7000000000000000002" in the SQL for use in this tutorial.
    • The inputFile property points to the file that contains the sample rating data. For example, <_config:property name="inputFile" value="C:\IBM\WCDE_ENT70\bin\Ratings.xml"/>.
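Because a wrong inputFile value only surfaces when the preprocessor runs, it can help to check the path beforehand. The following is a minimal sketch of such a check, assuming the value is available as a plain string; the class and method names are hypothetical and are not part of the tutorial's source.

```java
import java.io.File;

// Hypothetical helper: verifies that a configured inputFile value points to
// an existing, readable file before preprocessing is attempted.
class InputFileCheckSketch {

    /** Returns true only if the path is non-empty and names a readable file. */
    static boolean isReadableFile(String path) {
        if (path == null || path.trim().length() == 0) {
            return false;
        }
        File file = new File(path.trim());
        return file.isFile() && file.canRead();
    }

    public static void main(String[] args) {
        // The default path is the one used in this tutorial; pass another
        // path as the first argument to check a different location.
        String path = args.length > 0 ? args[0] : "C:\\IBM\\WCDE_ENT70\\bin\\Ratings.xml";
        System.out.println(path + " readable: " + isReadableFile(path));
    }
}
```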
  4. Save your changes and close the file.
  5. Create the following Java classes in the WebSphereCommerceServerExtensionsLogic project. You can create the files manually from the following source code, or import them into your project from the src.zip archive that is provided in the introduction to the tutorial.
    StaticRatingsDataPreProcessor
    
    package com.mycompany.commerce.preprocess;
    
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.logging.Level;
    
    import com.ibm.commerce.foundation.common.util.logging.LoggingHelper;
    import com.ibm.commerce.foundation.dataimport.exception.DataImportApplicationException;
    import com.ibm.commerce.foundation.dataimport.preprocess.AbstractDataPreProcessor;
    import com.ibm.commerce.foundation.dataimport.preprocess.DataPreProcessor;
    import com.ibm.commerce.foundation.dataimport.preprocess.config.DataImportPreProcessConfig.DataProcessingConfig;
    import com.mycompany.commerce.preprocess.rating.ProductRating;
    import com.mycompany.commerce.preprocess.rating.Rating;
    import com.mycompany.commerce.preprocess.rating.RatingXMLReader;
    import com.mycompany.commerce.preprocess.rating.XMLReaderException;
    
    public class StaticRatingsDataPreProcessor extends AbstractDataPreProcessor
          implements DataPreProcessor {
    
       private final static String CLASSNAME = StaticRatingsDataPreProcessor.class
             .getName();
    
       private final static java.util.logging.Logger LOGGER = LoggingHelper
             .getLogger(StaticRatingsDataPreProcessor.class);
    
       @Override
       public void process(DataProcessingConfig dataProcessingConfig,
             Connection connection, boolean fullBuild, String localeName,
             boolean bJ2SE) throws SQLException, DataImportApplicationException {
          final String METHODNAME = "process(DataProcessingConfig, Connection, boolean, String, boolean)";
          LOGGER.entering( CLASSNAME, METHODNAME,
                new Object[] { Boolean.toString(fullBuild) });
    
          String tableName = dataProcessingConfig.getTableName();
          String tableDefinition = dataProcessingConfig.getTableDefinition();
          String filename = (String)dataProcessingConfig.getPropertyMap().get("inputFile");
    
          createDBTable(connection, tableName, tableDefinition, bJ2SE);
    
          RatingXMLReader xmlReader = new RatingXMLReader();
          ArrayList<ProductRating> pRatings = null;
          try {
             pRatings = xmlReader.constructProductRatingsFromXML(filename);
          } catch (XMLReaderException e) {
    
             
             LOGGER.log(Level.SEVERE, e.getMessage());
          }
    
          String insertSQL = "insert into " + tableName
                + "(PART_NUMBER, RTYPE, RATING) values (?, ?, ?)";
          PreparedStatement insertStmt = null;
          try {
             insertStmt = connection.prepareStatement(insertSQL);

             if (pRatings != null) {
                for (int i = 0; i < pRatings.size(); i++) {
                   ProductRating aProdRating = pRatings.get(i);

                   if (aProdRating != null) {
                      String partNumber = aProdRating.getPartNumber();
                      Rating aRating = aProdRating.getRating();

                      if (aRating != null) {
                         String avgRating = aRating.getAvgRating();
                         String ratingType = aRating.getRatingType();

                         // Insert only complete rows into the temporary table.
                         if (avgRating != null && ratingType != null
                               && partNumber != null) {
                            insertStmt.setString(1, partNumber);
                            insertStmt.setString(2, ratingType);
                            insertStmt.setString(3, avgRating);
                            insertStmt.executeUpdate();

                            LOGGER.log(Level.FINER, "Inserted rating for part number " + partNumber);
                         }
                      }
                   }
                }
             }
          } catch (Exception e) {
             // Log failures at SEVERE so that they are visible at default trace levels.
             LOGGER.log(Level.SEVERE, e.getMessage(), e);
          } finally {
             // Release the statement whether or not the inserts succeeded.
             if (insertStmt != null) {
                insertStmt.close();
             }
          }
          LOGGER.exiting(CLASSNAME, METHODNAME);
       }
    
    }
    
    StaticRatingsDataPopulator
    
    package com.mycompany.commerce.preprocess;
    
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.logging.Level;
    
    import com.ibm.commerce.foundation.common.util.logging.LoggingHelper;
    import com.ibm.commerce.foundation.dataimport.exception.DataImportApplicationException;
    import com.ibm.commerce.foundation.dataimport.preprocess.AbstractDataPreProcessor;
    import com.ibm.commerce.foundation.dataimport.preprocess.DataPreProcessor;
    import com.ibm.commerce.foundation.dataimport.preprocess.config.DataImportPreProcessConfig.DataProcessingConfig;
    
    public class StaticRatingsDataPopulator extends AbstractDataPreProcessor
          implements DataPreProcessor {
    
       private final static String CLASSNAME = StaticRatingsDataPopulator.class
             .getName();
    
       private final static java.util.logging.Logger LOGGER = LoggingHelper
             .getLogger(StaticRatingsDataPopulator.class);
    
       @Override
       public void process(DataProcessingConfig dataProcessingConfig,
             Connection connection, boolean fullBuild, String localeName,
             boolean bJ2SE) throws SQLException, DataImportApplicationException {
          
          LOGGER.log(Level.INFO,"EXECUTING TI_RATING insert");
    
          final String METHODNAME = "process(DataProcessingConfig, Connection, boolean, String, boolean)";
          LOGGER.entering(CLASSNAME, METHODNAME, new Object[] { Boolean
                .toString(fullBuild) });
    
          String tableName = dataProcessingConfig.getTableName();
          String tableDefinition = dataProcessingConfig.getTableDefinition();
    
          createDBTable(connection, tableName, tableDefinition, bJ2SE);
    
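          String insertSQL = dataProcessingConfig.getSQL().trim();

          PreparedStatement insertStmt = null;
          try {
             insertStmt = connection.prepareStatement(insertSQL);
             insertStmt.executeUpdate();

             LOGGER.log(Level.FINER, "EXECUTED " + insertSQL);
          } catch (Exception e) {
             // Do not swallow failures silently; record them in the log.
             LOGGER.log(Level.SEVERE, e.getMessage(), e);
          } finally {
             // Release the statement whether or not the insert succeeded.
             if (insertStmt != null) {
                insertStmt.close();
             }
          }
          LOGGER.exiting(CLASSNAME, METHODNAME);
       }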
    
    }
    
  6. The preceding two Java classes depend on six additional classes, including RatingXMLReader. These classes are also in the src.zip archive provided.
    Note: RatingXMLReader is a simple Java class that takes an XML file name and parses the file. Its implementation depends on the format of the XML, and both the format and the parsing approach are left open. For example, the following snippet shows a sample Ratings.xml file:
    <?xml version="1.0" encoding="utf-8"?>
    <customInfo>
      <product partNumber="AC-01">
        <rating type="quality">
          <averageRating>1.7</averageRating>
          <reviewCount>60</reviewCount>
        </rating>
      </product>
      <product partNumber="AC-0101">
        <rating type="quality">
          <averageRating>4.6</averageRating>
          <reviewCount>85</reviewCount>
        </rating>
      </product>
    </customInfo>
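Because the parsing approach is left open, the following is one possible sketch that reads the sample format with the standard DOM API from the JDK. It is not the RatingXMLReader class from src.zip: the class name RatingsXmlSketch and the choice to return plain string triples are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; the tutorial's own reader is RatingXMLReader in src.zip.
class RatingsXmlSketch {

    /**
     * Parses ratings XML in the sample format and returns one
     * {partNumber, ratingType, averageRating} triple per <rating> element.
     */
    static List<String[]> parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String[]> rows = new ArrayList<String[]>();
            NodeList products = doc.getElementsByTagName("product");
            for (int i = 0; i < products.getLength(); i++) {
                Element product = (Element) products.item(i);
                String partNumber = product.getAttribute("partNumber");

                NodeList ratings = product.getElementsByTagName("rating");
                for (int j = 0; j < ratings.getLength(); j++) {
                    Element rating = (Element) ratings.item(j);
                    NodeList avg = rating.getElementsByTagName("averageRating");
                    // Skip ratings that carry no average value.
                    if (avg.getLength() > 0) {
                        rows.add(new String[] {
                                partNumber,
                                rating.getAttribute("type"),
                                avg.item(0).getTextContent().trim() });
                    }
                }
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse ratings XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<customInfo>"
                + "<product partNumber=\"AC-01\"><rating type=\"quality\">"
                + "<averageRating>1.7</averageRating><reviewCount>60</reviewCount>"
                + "</rating></product>"
                + "</customInfo>";
        for (String[] row : parse(xml)) {
            System.out.println(row[0] + " " + row[1] + " " + row[2]);
        }
    }
}
```

The three values in each triple correspond to the PART_NUMBER, RTYPE, and RATING columns of TI_RATING_TEMP. The reviewCount element is ignored here because the temporary table has no column for it.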
    
  7. For the preprocess utilities to find the classes at run time, package the classes in a JAR file in the following location:
    • WebSphere Commerce Developer: WCDE_installdir\workspace\WC
    • Solaris, Linux, AIX, Windows: WC_installdir\instances\instance_name\wc.ear
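To confirm that the packaged classes are visible, a quick check can attempt to load them by name. The following sketch assumes it runs with the same class path as the preprocess utilities, which this tutorial does not guarantee; the class name ClasspathCheckSketch is hypothetical.

```java
// Hypothetical helper: reports whether a class can be found on the class path
// of the JVM that runs this check.
class ClasspathCheckSketch {

    /** Returns true if the named class can be located without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false,
                    ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
                "com.mycompany.commerce.preprocess.StaticRatingsDataPreProcessor",
                "com.mycompany.commerce.preprocess.StaticRatingsDataPopulator" };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```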