ORDER BY and non-English data

By default, HCL® OneDB® database servers use the U.S. English language environment, called a locale, for database data. The U.S. English locale specifies data sorted in code-set order. This default locale uses the ISO 8859-1 code set.

If your database contains non-English data, you should store non-English data in NCHAR (or NVARCHAR) columns to obtain results sorted by the language. The ORDER BY clause should return data in the order appropriate to that language.

The following query uses a SELECT statement with an ORDER BY clause to search the table, abonnés, and to order the selected information by the data in the nom column.
Figure 1: Query
SELECT numéro,nom,prénom 
   FROM abonnés
   ORDER BY nom;
The collation order for the results of this query can vary, depending on the following system variations:
  • Whether the nom column is CHAR or NCHAR data type. The database server sorts data in CHAR columns by the order the characters appear in the code set. The database server sorts data in NCHAR columns by the order the characters are listed in the collation portion of the locale.
  • Whether the database server is using the correct non-English locale when it accesses the database. To use a non-English locale, you must set the CLIENT_LOCALE and DB_LOCALE environment variables to the appropriate locale name.

For the query to return expected results, the nom column should be NCHAR data type in a database that uses a French locale. Other operations, such as less than, greater than, or equal to, are also affected by the user-specified locale. For more information on non-English data and locales, see the HCL OneDB GLS User's Guide.

The following result and Query result show two sample sets of output.
Figure 2: Query result
numéro      nom                prénom

13612       Azevedo            Edouardo Freire
13606       Dupré              Michle Françoise
13607       Hammer             Gerhard
13602       Hämmer             le Greta
13604       LaForêt            Jean-Nol
13610       LeMatre           Hélose
13613       Llanero            Gloria Dolores
13603       Montaña            José Antonio
13611       Oatfield           Emily
13609       Tiramis           Paolo Alfredo
13600       da Sousa           João Lourenço Antunes
13615       di Girolamo        Giuseppe
13601       Ålesund            Sverre
13608       Étaix              Émile
13605       Ötker              Hans-Jrgen
13614       Øverst             Per-Anders
The following query result follows the ISO 8859-1 code-set order, which ranks uppercase letters before lowercase letters and moves names that contain an accented character (Ålesund, Étaix, Ötker, and Øverst) to the end of the list.
Figure 3: Query result
numéro        nom            prénom

13601         Ålesund        Sverre
13612         Azevedo        Edouardo Freire
13600         da Sousa       João Lourenço Antunes
13615         di Girolamo    Giuseppe
13606         Dupré          Michle Françoise
13608         Étaix          Émile
13607         Hammer         Gerhard
13602         Hämmer         le Greta
13604         LaForêt        Jean-Nol
13610         LeMatre       Hélose
13613         Llanero        Gloria Dolores
13603         Montaña        José Antonio
13611         Oatfield       Emily
13605         Ötker          Hans-Jrgen
13614         Øverst         Per-Anders
13609         Tiramis       Paolo Alfredo

The result shows that when the appropriate locale file is referenced by the database server, names including non-English characters (Ålesund, Étaix, Ötker, and Øverst) are collated differently than they are in the ISO 8859-1 code set. They are sorted correctly for the locale. It does not distinguish between uppercase and lowercase letters.