webFeedLoad best practices

When you run the webFeedLoad utility, follow these best practices.

  1. Processing large feeds

    If you are processing more than 1000 entries in a feed where the feed content is stored as managed files, ensure that the EAR updater job, ScheduledContentManagedFileEARUpdateCmdImpl, is stopped before the feed retriever scheduled job, FeedDataloadSchedulerCmd, runs.

  2. Changing data load configuration files
    The feed retriever generates a set of data load configuration files. To tune these generated files for performance or other reasons, complete the following steps:
    1. Run the webFeedLoad utility with the parameter -DGenerateDataLoadConfigOnly=true. This option generates the data load configuration files but does not process the feed.
    2. Modify the generated files as needed, following the best practices described for Data Load configuration files.
    3. If the feed configuration has not changed, set the -DGenerateDataLoadConfigOnly parameter to false for subsequent runs of feed retrieval.
  3. Delta feed retrieval and processing
    If your database is large, set the ID resolver cache size to 0 for small delta loads. For example, in the wc-dataload-env.xml file, set the cacheSize attribute of the ID resolver to 0:
    <_config:IDResolver className="com.ibm.commerce.foundation.dataload.idresolve.IDResolverImpl" cacheSize="0" />
  4. Running the feed retrieval

    Ensure that the FeedDataloadSchedulerCmd scheduled job and the webFeedLoad utility do not run simultaneously. If you run the utility from a batch script, ensure that the script finishes before the scheduled job for FeedDataloadSchedulerCmd begins.
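
For the ID resolver setting described under "Delta feed retrieval and processing", it can help to confirm the value that is actually in the configuration file before a delta load starts. The following Python sketch is one possible way to do that and is not part of the webFeedLoad utility. It reads the cacheSize attribute of every IDResolver element while ignoring the namespace prefix. The function name, the sample root element, and the namespace URI in the sample are assumptions made for illustration; in practice you would pass in the contents of your own wc-dataload-env.xml file.

```python
import xml.etree.ElementTree as ET


def id_resolver_cache_sizes(xml_text):
    """Return the cacheSize value of every IDResolver element in xml_text.

    Elements are matched by local name, so the namespace bound to the
    _config prefix does not need to be known in advance.
    """
    root = ET.fromstring(xml_text)
    sizes = []
    for elem in root.iter():
        # Parsed tags look like "{namespace-uri}IDResolver"; keep the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "IDResolver":
            sizes.append(elem.get("cacheSize"))
    return sizes


# Hypothetical, trimmed-down stand-in for wc-dataload-env.xml.
# The root element name and the namespace URI are placeholders.
SAMPLE = """<_config:SampleEnvConfiguration xmlns:_config="urn:example:config">
  <_config:IDResolver className="com.ibm.commerce.foundation.dataload.idresolve.IDResolverImpl" cacheSize="0" />
</_config:SampleEnvConfiguration>"""

if __name__ == "__main__":
    print(id_resolver_cache_sizes(SAMPLE))
```

A result other than a single "0" entry suggests that the file still uses a nonzero cache size, or that the IDResolver element is missing, and is worth reviewing before a small delta load runs against a large database.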