Data extraction utility for dynamic recommendations

The Intelligent Offer data extraction utility is a command-line utility that creates the Enterprise Product Report (EPR) data that is required for dynamic recommendations. The utility extracts catalog data from your database and generates ECDF and EPCMF files in the correct format for loading. You can provide these two files regularly for processing dynamic recommendations.

The data extraction utility generates two CSV (comma-separated value) file types that contain your catalog data for each IBM Digital Analytics client ID:
EPCMF (Enterprise Product Content Mapping File)
This file contains data that represents catalog entries, that is, products that can be bought, pre-built kits, and dynamic kits for a store. This file also specifies the master catalog category to which the catalog entry belongs.
ECDF (Enterprise Category Definition File)
This file contains data that represents the master catalog category hierarchies for a store.

Sample of the generated EPCMF file

This sample shows the catalog entry data that the utility extracts for the EPCMF file:
Sample EPCMF file

This file contains up to 55 columns:

  • The first five columns contain mandatory data:
    File date
    The date that the utility created the CSV file, in YYYYMMDD format.
    Client ID
    The IBM Digital Analytics client ID.
    Item ID
    The part number of the catalog entry.
    Item
    The name of the catalog entry.
    Items Primary Category ID
    The master catalog category to which the catalog entry belongs.
  • The remaining 50 columns are for customer-defined static attributes for catalog entries. Data mappings for the first six of these static attribute columns are predefined to contain specific catalog entry data, but you can change the predefined contents. For more information, see the data mapping descriptions in Sample business object configuration file for EPCMF data.
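
The column layout described above can be checked with a short script. The following Python sketch is illustrative only and is not part of the utility: it assumes that the EPCMF file has no header row, and the field labels, the function name parse_epcmf_row, and the sample values are all hypothetical. It splits a row into the five mandatory fields and the optional static attributes, and rejects rows that exceed 55 columns or whose file date is not in YYYYMMDD format.

```python
import csv
import io
from datetime import datetime

# Labels for the five mandatory columns, in file order (labels are illustrative).
MANDATORY_COLUMNS = [
    "file_date",
    "client_id",
    "item_id",
    "item_name",
    "primary_category_id",
]
MAX_STATIC_ATTRIBUTES = 50  # columns 6 through 55


def parse_epcmf_row(row):
    """Split one EPCMF row into mandatory fields and static attributes."""
    max_columns = len(MANDATORY_COLUMNS) + MAX_STATIC_ATTRIBUTES
    if len(row) < len(MANDATORY_COLUMNS):
        raise ValueError(f"expected at least 5 columns, got {len(row)}")
    if len(row) > max_columns:
        raise ValueError(f"expected at most {max_columns} columns, got {len(row)}")

    record = dict(zip(MANDATORY_COLUMNS, row))

    # strptime alone accepts some shorter strings, so check the length too.
    file_date = record["file_date"]
    if len(file_date) != 8:
        raise ValueError(f"file date is not in YYYYMMDD format: {file_date!r}")
    datetime.strptime(file_date, "%Y%m%d")  # raises ValueError on a bad date

    record["static_attributes"] = list(row[len(MANDATORY_COLUMNS):])
    return record


# Hypothetical sample data: one catalog entry with two static attributes.
SAMPLE_EPCMF = "20240115,90000001,SKU-1001,Espresso Machine,CAT-200,149.99,Acme\n"

records = [parse_epcmf_row(row) for row in csv.reader(io.StringIO(SAMPLE_EPCMF))]
print(records[0]["item_id"], records[0]["static_attributes"])
```

A check like this might be run on a generated file before it is handed off, to catch a business object configuration that maps more static attributes than the file supports.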

Sample of the generated ECDF file

This sample shows the catalog hierarchy data that the utility extracts for the ECDF file:
Sample ECDF file
The five columns in this file contain mandatory data:
File date
The date that the utility created the CSV file, in YYYYMMDD format.
Client ID
The IBM Digital Analytics client ID.
Category ID
The category identifier.
Category Name
The name of the category.
Parent Category ID
The category identifier of the parent category.
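
Because each ECDF row names its parent category, the master catalog hierarchy can be rebuilt from the file. The following Python sketch is illustrative only: it assumes that the file has no header row and that a top-level category has an empty Parent Category ID, neither of which is confirmed here. The function names load_ecdf and category_path and the sample rows are hypothetical.

```python
import csv
import io


def load_ecdf(text):
    """Read ECDF rows into name and parent lookups keyed by category ID."""
    names = {}
    parents = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 5:
            raise ValueError(f"expected 5 columns, got {len(row)}")
        _file_date, _client_id, category_id, category_name, parent_id = row
        names[category_id] = category_name
        parents[category_id] = parent_id
    return names, parents


def category_path(category_id, names, parents):
    """Return category names from the top-level category down to category_id."""
    path = []
    seen = set()  # guards against a cycle in the parent references
    while category_id in names and category_id not in seen:
        seen.add(category_id)
        path.append(names[category_id])
        category_id = parents[category_id]
    return list(reversed(path))


# Hypothetical sample data: a top-level category and one child category.
SAMPLE_ECDF = (
    "20240115,90000001,CAT-100,Kitchen,\n"
    "20240115,90000001,CAT-200,Coffee Makers,CAT-100\n"
)

names, parents = load_ecdf(SAMPLE_ECDF)
print(category_path("CAT-200", names, parents))
```

The same lookups could be used to confirm that every Items Primary Category ID in the EPCMF file refers to a category that is present in the ECDF file.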

Configuration files for the data extraction utility

The data extraction utility uses three types of configuration files. Samples are provided, but you must update the samples with configuration information specific to your site and environment. These configuration files are based on the Data Load utility configuration files, but they include some extensions.
wc-dataextract.xml
This file is the main configuration file that you must point to when you run the utility. This file specifies the paths to the environment configuration file and to the business object configuration file.
wc-dataextract-env.xml
This file is the environment configuration file. You must configure the language of the store and the currency for the price data before you run the utility.
wc-dataextract-business_object.xml
This file is the business object configuration file. For this utility, you need two versions of this file:
  • wc-dataextract-catalog-entry.xml: This business object configuration file is used to extract catalog entry data for the EPCMF file.
  • wc-dataextract-catalog-group.xml: This business object configuration file is used to extract category data for the ECDF file.
These files contain:
  • Business context information.
  • Data mappings that are required to transform WebSphere Commerce business objects into the data that is written to columns in the EPCMF or ECDF file. The EPCMF file supports up to 50 customer-defined static attributes for catalog entries.
  • Definitions for the order in which the utility writes the data to the columns in the file.
  • Pointers to interfaces and implementation classes that the utility uses.

Using the utility in different environments

The data extraction utility can be run in the staging and production environments. However, run the utility in an environment that has all of the required information. For example, if the staging environment does not have inventory or pricing information, run the utility in the production environment.

You can generate the CSV files in your staging environment to load into your test environment. You can also generate the CSV files in your production environment to load into your production environment. The utility is not intended to be run in the development environment. Support is provided in the development environment with a Derby database for customization purposes only, for example, when you are testing changes to the business object configuration file to include custom catalog entry attributes for the EPCMF file.