Configuring the CSV data reader

Configure the comma-separated values (CSV) data reader in the business object configuration file to change how data is read from CSV source files. You might want to change the default settings of the CSV data reader so that it works better with the format of your existing source data.

The CSV data reader reads and processes data from an input CSV file one line at a time. Each line in the CSV file must have the same data structure. The data read from the CSV file can be mapped to a WebSphere Commerce business object by using a business object configuration file. The configuration file maps each column of data in the input CSV file directly to a property of a WebSphere Commerce business object.

Procedure

  1. Open the wc-loader-<object>.xml configuration file in edit mode. A sample of this file is located in the WC_installdir/samples/DataLoad/Catalog directory.
  2. Find the <_config:DataReader> element.
  3. Add the following optional parameters inside the <_config:DataReader> tag:
    lineDelimiter
    Specifies the line separator character or record separator character. The default value is the new line character. The lineDelimiter character cannot appear in the content of a token unless enclosed within the tokenValueDelimiter character.
    tokenDelimiter
    Specifies the token separator character for one record. The default is the comma character (,).
    tokenValueDelimiter
    Specifies the string separator character. The tokenValueDelimiter is used to indicate the beginning and the end of a string. The default tokenValueDelimiter character is the double quotation mark ("). For example, the following token containing commas can be used for a catalog entry short description:
    "Men's fashions for business, casual, and formal occasions"
    Note: If you are editing your file with a plain text editor, use the tokenValueDelimiter when your token contains special characters, such as the tokenDelimiter character or the tokenValueDelimiter itself. To use the tokenValueDelimiter character within the token, you must use two tokenValueDelimiter characters. For example, the following token containing commas and quotation marks can be used for a catalog entry short description:
    "Men's fashions for ""business"", ""casual"", and ""formal"" occasions."
    The resulting token value is:
    Men's fashions for "business", "casual", and "formal" occasions.
    These uses of the tokenValueDelimiter apply only when you are using a plain text editor to edit your file.
    charset
    Specifies the character set of the CSV file. The default character set is UTF-8.
    firstLineIsHeader
    Indicates that the first line in the CSV file contains column header information. Use this parameter instead of providing the column mappings in the <_config:Data> element in the wc-loader-<object>.xml configuration file. The default value is false.
    useHeaderAsColumnName
    Indicates that the first line in the CSV file is used for the column names. The default value for useHeaderAsColumnName is false. There are four possible combinations of the firstLineIsHeader and useHeaderAsColumnName parameters:
    1. firstLineIsHeader = "false" and useHeaderAsColumnName = "false". In this case, the column mappings in the wc-loader-<object>.xml configuration file are mandatory.
    2. firstLineIsHeader = "false" and useHeaderAsColumnName = "true". In this case, the useHeaderAsColumnName flag is ignored and the column mappings are mandatory.
    3. firstLineIsHeader = "true" and useHeaderAsColumnName = "false". In this case, the column mapping configuration is optional. If the column mapping configuration is defined in the wc-loader-<object>.xml configuration file, the data reader uses it. Otherwise, the data reader uses the CSV header for the column names.
    4. firstLineIsHeader = "true" and useHeaderAsColumnName = "true". In this case, the column mapping configuration is ignored and the data reader always uses the CSV header for the column names.
  4. Save and close the file.
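
As an illustration of step 3, the following fragment is a minimal sketch for a source file that is assumed to be pipe-delimited, encoded in ISO-8859-1, and to begin with a header row whose names match the column names that the business object mappings expect. Only the attributes described above are shown; the charset name is assumed to follow standard Java character set naming. Any other attributes that your <_config:DataReader> element already carries are left out here and should be kept. Parameters that are not specified keep their default values.

```xml
<!-- Sketch only: pipe-delimited, ISO-8859-1 source file with a header row. -->
<!-- lineDelimiter and tokenValueDelimiter are omitted, so their defaults apply. -->
<_config:DataReader tokenDelimiter="|" charset="ISO-8859-1"
    firstLineIsHeader="true" useHeaderAsColumnName="true" />
```

With both header parameters set to true (combination 4), the column names are taken from the first line of the CSV file and any column mapping configuration is ignored.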

Example

The following code snippet shows all of the parameters set to their default values:
<_config:DataReader lineDelimiter="\n" tokenDelimiter="," tokenValueDelimiter='"' 
charset="UTF-8" firstLineIsHeader="false" useHeaderAsColumnName="false" />
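
For comparison, the following fragment sketches combination 1, where the CSV file has no header row and the column mappings are therefore mandatory. The <_config:Data> element is named in this topic, but the <_config:column> child element with its number and name attributes is an assumption about the mapping syntax, and the column names PartNumber, Name, and ShortDescription are hypothetical. Check both against the sample wc-loader-<object>.xml file before you use them.

```xml
<!-- Sketch only: no header row, so every column is mapped explicitly. -->
<_config:DataReader firstLineIsHeader="false" useHeaderAsColumnName="false">
  <_config:Data>
    <!-- Assumed mapping syntax; the column names are hypothetical. -->
    <_config:column number="1" name="PartNumber" />
    <_config:column number="2" name="Name" />
    <_config:column number="3" name="ShortDescription" />
  </_config:Data>
</_config:DataReader>
```

With this configuration, the first line of the CSV file is read as data, so the file must not contain a header line.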