Vendor database character sets and the HCL Compass data code page

The vendor database character set describes the setting for the database management system (DBMS) that determines which characters can be stored in the database.

DBMS vendors use a variety of terms to describe their character sets. The following table lists the alternative terms used by the DBMS vendors that HCL Compass supports.
Database management system    Vendor database character set synonyms
SQL Server                    code page, collation

The HCL Compass data code page setting determines which characters are written to the database.

Before you set up your database management system to be used with HCL Compass, you must choose the HCL Compass data code page to set for your schema repository and user databases. For more information, see Guidelines for selecting an HCL Compass data code page. When you set up your database management system, assign a vendor database character set value that corresponds to the HCL Compass data code page you selected. All databases in a database set must have the same vendor database character set.

If you configure your database management system with a vendor database character set that does not support the HCL Compass data code page you selected for the schema repository, you cannot set the data code page. Therefore, you must always know what the data code page value is before you create and configure a vendor database to use with an HCL Compass schema repository.

In general, set the HCL Compass data code page and the vendor database character set to the corresponding values in Supported vendor database character sets.

However, the HCL Compass data code page and the vendor database character set can be different from the values listed in Supported vendor database character sets, if both of these conditions are true:
  • The characters of the HCL Compass data code page are a subset of the characters of the vendor database character set.
  • The database currently contains only the characters supported by the HCL Compass data code page.

For example, it is possible to use the HCL Compass data code page 20127 (ASCII) with SQL Server database code page 1252 (Latin-1). The database can store all the characters that are valid in the data code page because ASCII is a subset of Latin-1.
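The subset condition can be illustrated with a short Python sketch. This is not part of HCL Compass. It assumes that the Python codec names "ascii" and "cp1252" stand in for code pages 20127 and 1252, and it works only for single-byte code pages. The function names are hypothetical.

```python
def encodable_chars(codec):
    """Return the set of characters that a single-byte codec defines."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # this byte value is undefined in the codec
    return chars


def is_subset(data_codec, db_codec):
    """True if every character of data_codec also exists in db_codec."""
    return encodable_chars(data_codec) <= encodable_chars(db_codec)


# Assumption: "ascii" ~ code page 20127, "cp1252" ~ code page 1252.
print(is_subset("ascii", "cp1252"))   # True
print(is_subset("cp1252", "ascii"))   # False
```

The check covers only the first condition. Whether the database currently contains only characters from the data code page must be verified against the stored data.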

However, because these situations are so variable, the safest practice is to set the HCL Compass data code page and the vendor database character set to the corresponding values.

Validating the vendor database character set

The HCL Compass data code page value is validated against the value of the vendor database character set when you perform these tasks:

  • Create a schema repository and set the HCL Compass data code page.
  • Use the Maintenance Tool to change a schema repository's data code page value.
  • Set or change the data code page value for a schema repository by using the installutil command.

Compatibility with existing database sets

You might be required to change your previous vendor database character sets to support data from your preferred HCL Compass data code page. To change the vendor database character set of existing databases, you might have to move the old data into new databases. If your existing data is not supported by one of the HCL Compass data code pages, you must first convert the data to values in one of the supported code pages. Some database vendors provide tools that you can use to analyze and convert your data. You can also use the codepageutil command to analyze your data.
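As a rough illustration of what analyzing data for code page support involves, the following Python sketch reports the characters in a list of strings that a target code page cannot represent. It is not the codepageutil command and does not reproduce its behavior. The function names and the sample values are hypothetical, and the Python codec names are assumed to correspond to the code pages of interest.

```python
def _encodable(ch, codec):
    """True if the single character ch can be encoded with codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def find_unsupported(values, codec):
    """Return (index, offending characters) for each value the codec cannot encode."""
    problems = []
    for index, text in enumerate(values):
        bad = sorted({ch for ch in text if not _encodable(ch, codec)})
        if bad:
            problems.append((index, bad))
    return problems


# Hypothetical sample data, standing in for field values exported from a database.
sample = ["plain", "café", "naïve €"]
print(find_unsupported(sample, "ascii"))   # [(1, ['é']), (2, ['ï', '€'])]
print(find_unsupported(sample, "cp1252"))  # []
```

Values reported by such a check would have to be converted or corrected before the data is moved into a database that uses the target code page.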