Index Load configuration files for indexing from CSV files

You can load index information from a CSV file. Index Load requires configuration files before it can be run from a web browser.

Loading the index from a CSV file

Follow these steps to load index information from a CSV file.

  1. Edit the wc-dataload-profile.xml configuration file to add the CSV file location and the target core name.
  2. Specify CSVReader as the reader and SolrIndexLoadMapObjectBuilder as the business object builder in wc-businessObject-profile.xml.
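
Before you point Index Load at a CSV file, it can help to confirm that the file is well formed. The following is a minimal sketch, not part of Index Load itself. It assumes that the first line of the file is a header row, and the column names in the sample data are hypothetical placeholders rather than a documented format.

```python
import csv
import io


def check_csv(text):
    """Return (header, row_count), or raise ValueError if the CSV is malformed."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header or any(not name.strip() for name in header):
        raise ValueError("missing or empty column name in header row")
    row_count = 0
    for row in reader:
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        if len(row) != len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, "
                f"got {len(row)}"
            )
        row_count += 1
    return header, row_count


# Hypothetical sample data; the column names are placeholders only.
sample = "catentry_id,price_USD\n10001,19.99\n10002,24.50\n"
header, rows = check_csv(sample)
print(header, rows)
```

To check a real file, read its contents into a string first and pass that string to check_csv.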

If you are using a CSV file to load index data, Index Load requires three configuration files. These files are based on the XML schema definitions of the Data Load framework:

Index Load configuration files

Index Load configuration file                                      Schema definition file
Environment configuration file (wc-indexload-env.xml)              wc-dataload-env.xsd
Profile configuration file (wc-indexload-profileName.xml)          wc-indexload.xsd
Profile item configuration file (wc-indexload-businessobject.xml)  wc-indexload-item.xsd

Environment configuration file (wc-indexload-env.xml)

The wc-indexload-env.xml file contains environment control information and global properties that are required by Index Load, including a common data writer and data source to be used to persist the data.

The wc-indexload-env.xml file does not typically require customization. You can use the default sample file as-is.

Profile configuration file (wc-indexload-profileName.xml)

The wc-indexload-profileName.xml file contains configurable performance attributes and load item configurations.

Profile names that you define in configuration files are passed as a URL parameter when you call Index Load from a web browser.

The load item configurations are listed under the load order section of this file. They are processed in the same order as they are specified.

The file can contain one or more LoadItem definitions. Each LoadItem specifies its business object configuration file and its target coreName. Multiple LoadItems are run in parallel, without sequence.

Example: wc-indexload-price.xml

<_config:LoadItem name="ExternalPrice-1" businessObjectConfigFile="wc-indexload-price-sql.xml">
    <_config:property name="coreName" value="MC_10001_CatalogEntry_Price_generic" />
    <_config:property name="groupName" value="1" />
</_config:LoadItem>

The following configurable performance attributes apply to profile configuration files:

batchSize
    The threshold at which documents are soft committed in memory.
    The default value is 1. If a value of 0 is specified, documents are not committed until the load item finishes.
commitCount
    The threshold at which documents are hard committed from memory to disk.
    You can use a commitCount of 0 if you use a memory-based commit. For more information, see Tuning Index Load.
ThreadLaunchTimeDelay
    The amount of time, in milliseconds, to wait before starting another thread. The delay avoids overloading the system at startup.
    The default value is 1000.
OptimizeAfterIndexing
    Indicates whether Index Load performs index optimization after commit.
    Note: Performing optimization after full indexing improves runtime performance; however, it increases the overall indexing time.
StatusRefreshInterval
    The maximum amount of time, in seconds, to wait before refreshing the current Index Load status and displaying it in the administrative log.
    The default value is 300. Use a value of -1 to disable the service.
DocumentSizeSamplingInterval
    The time interval, in seconds, at which the size of the indexed document is calculated.
    The default value is 300. Use a value of -1 to disable the service.
IndexHeightCacheHint
    A number that the system uses as a hint to size the applicable caches for index height during indexing.
IndexWidthCacheHint
    A number that the system uses as a hint to size the applicable caches for index width during indexing.
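
To get a feel for how batchSize and commitCount interact, the following toy model counts how many soft and hard commits a load item would trigger. It is an illustrative sketch only, not the Index Load implementation. It assumes that both thresholds are measured in documents and it does not model the final commit that happens when the load item finishes. The function name count_commits is hypothetical.

```python
def count_commits(total_docs, batch_size=1, commit_count=0):
    """Toy model: return (soft_commits, hard_commits) for one load item.

    Assumption: both thresholds are document counts. A value of 0 means
    that the threshold never triggers while the load item is running.
    """
    soft = total_docs // batch_size if batch_size > 0 else 0
    hard = total_docs // commit_count if commit_count > 0 else 0
    return soft, hard


# 10,000 documents, soft commit every 500, hard commit every 5,000.
print(count_commits(10000, batch_size=500, commit_count=5000))  # (20, 2)

# batchSize of 0: nothing is committed until the load item finishes.
print(count_commits(10000, batch_size=0))  # (0, 0)
```

Under these assumptions, the default batchSize of 1 triggers a soft commit for every document, which is why larger values are usually worth testing for bulk loads.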

Profile item configuration file (wc-indexload-external-price.xml)

<_config:LoadItem name="ExternalPrice-1" businessObjectConfigFile="wc-indexload-external-price.xml">
    <_config:property name="coreName" value="MC_10001_CatalogEntry_Price_generic" />
    <_config:DataSourceLocation location="C:\Patches\delta.csv" />
</_config:LoadItem>
Where:
coreName
    The name of the extension core.
DataSourceLocation
    The location of the CSV data file.
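
A quick way to review a LoadItem definition is to parse it and list the coreName and CSV location for each item. The following is a minimal sketch, not an Index Load utility. The wrapper element and the namespace URI are placeholders that exist only so that the fragment parses on its own; real configuration files declare their own root element and namespace. The function summarize_load_items is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder wrapper: the root element name and namespace URI are assumptions
# made only so that the LoadItem fragment can be parsed in isolation.
PROFILE = r"""
<_config:Wrapper xmlns:_config="urn:example:placeholder">
  <_config:LoadItem name="ExternalPrice-1" businessObjectConfigFile="wc-indexload-external-price.xml">
    <_config:property name="coreName" value="MC_10001_CatalogEntry_Price_generic" />
    <_config:DataSourceLocation location="C:\Patches\delta.csv" />
  </_config:LoadItem>
</_config:Wrapper>
""".strip()


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_load_items(xml_text):
    """Return one dict per LoadItem with its name, coreName, and CSV location."""
    items = []
    for elem in ET.fromstring(xml_text).iter():
        if local(elem.tag) != "LoadItem":
            continue
        info = {"name": elem.get("name"), "coreName": None, "location": None}
        for child in elem:
            if local(child.tag) == "property" and child.get("name") == "coreName":
                info["coreName"] = child.get("value")
            elif local(child.tag) == "DataSourceLocation":
                info["location"] = child.get("location")
        items.append(info)
    return items


for item in summarize_load_items(PROFILE):
    missing = [key for key, value in item.items() if value is None]
    print(item["name"], "OK" if not missing else f"missing: {missing}")
```

Because the sketch matches elements by local name, it does not depend on the namespace URI that a particular file declares.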

Sample configuration files

Download and extract the following sample code: IndexLoadSampleCode.zip. The sample includes the configuration files that Index Load uses and, for reference, the manual updates that are performed in the task Indexing contract prices using Index Load.