
Configuring the XML data reader
Configure the extensible markup language (XML) data reader in the business object configuration file to modify the way that data is read from XML-formatted source files. You might want to change the default settings of the XML data reader so that it works better with the format of your data.
The data that is read from an XML file can be mapped to a WebSphere Commerce business object by using a business object configuration file. Using the configuration file, each element of data in the input XML file can be mapped directly to a property of a WebSphere Commerce business object. The XML handler reads the input one record at a time, creates a name-value pair (NVP) mapping for each record, and then passes each mapping to a business object builder.
Note: If you include the maxError attribute within the data load order configuration file, you must set the value for the attribute to 1. If you set a different value, you can encounter unexpected behavior.
Procedure
- Locate the wc-loader-<object>.xml business object configuration file for the business object type that you are loading. Open the configuration file for editing.
Sample business object configuration files are in the following directories:
WC_installdir/samples/DataLoad/Catalog
WCDE_installdir/samples/DataLoad/Catalog
- Find the data reader configuration element:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" > </_config:DataReader>
- Optional: Add the handler classes within the data reader configuration element to change how the Data Load utility handles loading your XML data.
To add an XML handler class, specify the class in the following format:
<_config:XmlHandler className=""/>
For example, the following configuration adds the NVPXmlHandler XML handler class into the data reader configuration:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:XmlHandler className="com.ibm.commerce.foundation.dataload.xmlhandler.NVPXmlHandler" />
</_config:DataReader>
The following class is available by default:
- NVPXmlHandler
- This handler class is the default handler for the XmlReader and
is used to handle generic XML data that follows a specific CSV-like
file format. This handler reads each second-level element as a separate
object record. This handler parses your input file one object record
at a time and generates a hash map for each record that is then passed
to the business object builder. The key of this map is the element
or attribute name for the objects that you are loading of a particular
business object type. You can modify this default behavior by specifying
the following parameters: xpathEnabled, qualifiedName, and nvpReMapping.
See the detailed descriptions of how to use these optional properties
in the following step.
If you do not specify an XML handler in your data reader configuration, this handler is used. All data load configuration files that are used for loading CSV input files can be used to load XML input files. The Data Load framework switches the data reader that is used automatically depending on the file type, either CSV or XML, of the input file.
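The default behavior that is described above (each second-level element becomes one record, and each record becomes a map that is keyed by element or attribute name) can be illustrated with a minimal Python sketch. This sketch is not the NVPXmlHandler implementation. The function names, the CatalogEntries root element, and the rule that repeated names collapse into a list are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def _add(nvp, key, value):
    """Store value under key; repeated keys collapse into a list (assumed rule)."""
    if key not in nvp:
        nvp[key] = value
    elif isinstance(nvp[key], list):
        nvp[key].append(value)
    else:
        nvp[key] = [nvp[key], value]


def read_records(xml_text):
    """Yield one name-value mapping per second-level element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    for record in root:  # direct children of the root are the second-level elements
        nvp = {}
        for elem in record.iter():  # the record element itself and all descendants
            for attr_name, attr_value in elem.attrib.items():
                _add(nvp, attr_name, attr_value)
            text = (elem.text or "").strip()
            if text:  # only elements that carry text contribute a value
                _add(nvp, elem.tag, text)
        yield nvp


# Hypothetical input: the root element name is an assumption.
SAMPLE = """
<CatalogEntries>
  <CatalogEntry displaySequence="1.0">
    <PartNumber>productPartNumber-1</PartNumber>
  </CatalogEntry>
  <CatalogEntry displaySequence="2.0">
    <PartNumber>productPartNumber-2</PartNumber>
  </CatalogEntry>
</CatalogEntries>
"""

for mapping in read_records(SAMPLE):
    print(mapping)
```

Each printed dictionary corresponds to one CatalogEntry record, which is the unit that would be handed to a business object builder.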
- Optional: Add configuration properties within the data reader configuration element to meet your data loading requirements.
To add a configuration property, specify the property in the following format:
<_config:property name="" value=""/>
For example, the following configuration adds the recordXpath configuration property for a catalog entry into the data reader configuration:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:property name="recordXpath" value="CatalogEntry" />
</_config:DataReader>
The following optional properties are available for use with the default NVPXmlHandler class:
- recordXpath
- If your input file has the object element nested deeply, you can
set the XPath to have the handler start reading the nested object
element as the root element. When you specify this property, any XML
element that has a value that matches the XPath value of this property
is handled by the Data Load utility as a separate record. If you do
not specify this property, only the second-level XML elements are
handled as individual object records.
Specify the value for this parameter to be an XPath. The XPath can be absolute or relative. An XPath is absolute if it starts with a forward slash (/). A relative XPath is just a single element name. For example, you can specify the following absolute XPath:
<_config:property name="recordXpath" value="/Object/ObjectType/CatalogEntry" />
or the following relative XPath:
<_config:property name="recordXpath" value="CatalogEntry" />
This XPath ensures that the <CatalogEntry> object in the following sample is read as the record element:
<Object>
  <ObjectType>
    <CatalogEntry>
      <PartNumber>productPartNumber-1</PartNumber>
    </CatalogEntry>
  </ObjectType>
</Object>
The other elements, <Object> and <ObjectType>, are ignored.
- xpathEnabled
- If your element names are not unique, you can use this property
to use the XPath to create uniqueness in the NVP pair mapping. If
you specify this property with a value of true, the key for mapping
your data during the Data Load process uses the XPath to the element.
If this value is false, the key for mapping your data is the element
name or attribute name. The XPath that is used is relative to your
element record. The default value for this property is false.
Note: If you set this property to true, you must also change the value for the mapping of your object in the data load business object configuration file.
For example, if your input file contains the following catalog entry element:
<CatalogEntry catalogEntryTypeCode="ProductBean" displaySequence="1.0">
  <PartNumber>productPartNumber-1</PartNumber>
  <Description>
    <Name>name-1</Name>
  </Description>
</CatalogEntry>
If you set xpathEnabled to true, the XML handler builds the following mapping:
catalogEntryTypeCode = ProductBean
displaySequence = 1.0
PartNumber = productPartNumber-1
Description/Name = name-1
The keys in the mapping are XPaths that are always relative to the root of your record element, CatalogEntry, and that do not start with a forward slash (/). An attribute is treated like an element in the XPath.
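The xpathEnabled key scheme that is described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the handler's code: the function name is hypothetical, and the sketch simply overwrites a value if the same path occurs twice.

```python
import xml.etree.ElementTree as ET


def xpath_keyed_mapping(record):
    """Build a mapping whose keys are paths relative to the record element.

    Hypothetical helper; attributes are treated like child elements in the path.
    """
    mapping = {}

    def walk(elem, prefix):
        # prefix is "" for the record element, or a path that ends in "/"
        for attr_name, attr_value in elem.attrib.items():
            mapping[prefix + attr_name] = attr_value
        text = (elem.text or "").strip()
        if text and prefix:
            mapping[prefix[:-1]] = text  # drop the trailing "/" to form the key
        for child in elem:
            walk(child, prefix + child.tag + "/")

    walk(record, "")
    return mapping


record = ET.fromstring(
    '<CatalogEntry catalogEntryTypeCode="ProductBean" displaySequence="1.0">'
    "<PartNumber>productPartNumber-1</PartNumber>"
    "<Description><Name>name-1</Name></Description>"
    "</CatalogEntry>"
)

for key, value in xpath_keyed_mapping(record).items():
    print(key, "=", value)
```

For the sample catalog entry, the sketch produces the same four keys that are listed in the mapping above, none of which starts with a forward slash.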
- nvpReMapping
- This property controls how to redo the NVP mapping of your data
that is passed for an object record to the business object builder.
The value of this property defines a list of remapping rules for your
data. If the elements that contain information for your object contain
names that are not unique, you can use this configuration property
to ensure uniqueness. For example, within a catalog entry, object elements for the catalog entry <name>shirt</name> and attribute <name>color</name> can exist. The XML handler reads the values for these elements as two values for a single name element and records these values as a list in the NVP mapping, name=[shirt, color]. By remapping the XPath for these elements, you can ensure that the handler reads and maps these elements and values correctly.
Your list of NVP remapping rules must have each rule separated by a '|' character. Each rule contains three tokens that are separated by a comma ',' character. The first token is for the new key in the remapping. The second token is for the new value in the remapping, and the third token is for the prefix in the remapping key.
For example, if your input file contains the following catalog entry elements:
<CatalogEntry>
  <CatalogEntryIdentifier>
    <ExternalIdentifier>
      <PartNumber>productPartNumber-1</PartNumber>
    </ExternalIdentifier>
  </CatalogEntryIdentifier>
  <Description>
    <Attributes name="auxDescription1">auxDesc1-1</Attributes>
    <Attributes name="auxDescription2">auxDesc2-1</Attributes>
    <Attributes name="published">1</Attributes>
  </Description>
</CatalogEntry>
The handler class, by default, reads the XPath for the following description elements:
name=[auxDescription1, auxDescription2, published], Attributes=[auxDesc1-1, auxDesc2-1, 1]
The handler maps these elements as two elements:
name=[auxDescription1, auxDescription2, published]
Attributes=[auxDesc1-1, auxDesc2-1, 1]
If you set the remapping configuration property to be:
<_config:property name="nvpReMapping" value="name, Attributes, " />
The handler reads the elements as three separate elements and maps these elements as:
auxDescription1 = auxDesc1-1
auxDescription2 = auxDesc2-1
published = 1
If you specify the remapping rule that contains the prefix:
<_config:property name="nvpReMapping" value="name, Attributes, Description/Attributes/name/" />
These elements are read and mapped as:
Description/Attributes/name/auxDescription1 = auxDesc1-1
Description/Attributes/name/auxDescription2 = auxDesc2-1
Description/Attributes/name/published = 1
Note: If you do change the NVP mapping for an object, you must also change the value for the mapping of your object in the data load business object configuration file. For example, to map this data to use the remapping rules, your business object configuration mapping can be:
<_config:mapping xpath="Description/Attributes/auxDescription1" value="Description/Attributes/name/auxDescription1" />
<_config:mapping xpath="Description/Attributes/auxDescription2" value="Description/Attributes/name/auxDescription2" />
<_config:mapping xpath="Description/Attributes/published" value="Description/Attributes/name/published" />
The value prefix Description/Attributes/name is optional. If you do not use the prefix, your mapping can resemble:
<_config:mapping xpath="Description/Attributes/auxDescription1" value="auxDescription1" />
<_config:mapping xpath="Description/Attributes/auxDescription2" value="auxDescription2" />
<_config:mapping xpath="Description/Attributes/published" value="published" />
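The remapping rule format that is described for nvpReMapping (rules separated by '|', three comma-separated tokens per rule) can be illustrated with a short Python sketch. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: that the original list-valued entries are removed after remapping, and that keys and values are paired by position.

```python
def _as_list(value):
    """Normalize a single value or a list of values to a list."""
    return value if isinstance(value, list) else [value]


def apply_remapping(nvp, rules):
    """Apply 'newKey, newValue, prefix' rules to an NVP mapping.

    Hypothetical helper: the values stored under the first token become keys
    (with the prefix prepended) and the values under the second token become
    the matching values, paired by position.
    """
    result = dict(nvp)
    for rule in rules.split("|"):
        if not rule.strip():
            continue  # tolerate an empty rule string
        key_token, value_token, prefix = (t.strip() for t in rule.split(","))
        keys = _as_list(result.pop(key_token, []))
        values = _as_list(result.pop(value_token, []))
        for key, value in zip(keys, values):
            result[prefix + key] = value
    return result


default_mapping = {
    "name": ["auxDescription1", "auxDescription2", "published"],
    "Attributes": ["auxDesc1-1", "auxDesc2-1", "1"],
}

print(apply_remapping(default_mapping, "name, Attributes, "))
print(apply_remapping(default_mapping, "name, Attributes, Description/Attributes/name/"))
```

The first call corresponds to the rule without a prefix, and the second call corresponds to the rule with the Description/Attributes/name/ prefix.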
- qualifiedName
- The qualified name is used to ensure the uniqueness of the data
elements that you are loading. This uniqueness is achieved by the
inclusion of the namespace as part of the name for your element data
in the NVP pair mapping. Specify this property value as true to include
the namespace as part of the key to the map that is passed to your
business object builder. The default value is false.
Note: If you set this property to true, you must also change the value for the mapping of your object in the data load business object configuration file.
- Optional: Configure your Data Load process to include a data reader preprocess. To configure a preprocessor to run, specify the preprocessor class in the following format:
<_config:DataReaderPreprocessor className="" />
For example, the following configuration specifies that a file difference preprocessor is to run:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:DataReaderPreprocessor className="com.ibm.commerce.foundation.dataload.datareader.XmlFileDiffPreprocessor" />
</_config:DataReader>
The following data reader preprocessor is available for use with the Data Load utility:
- com.ibm.commerce.foundation.dataload.datareader.XmlFileDiffPreprocessor
- This preprocessor compares a specified old and new input file and generates a new file that contains only the differences that exist in the new file. This preprocessor can improve the performance of routine Data Load operations by avoiding loading data that was loaded previously. For more information about this preprocessor, see Data Load file difference preprocessing. If you are running this preprocessor, you can also include more configuration properties specific to this preprocessor. For more information about configuring this preprocessor, and the configuration properties available for this preprocessor, see Configuring the Data Load utility to run a file difference preprocess.
- Save and close your file.