Character/string comparison and sorting

Collation involves the sorting of character data that is either stored in a database or manipulated in a client application.

HCL Informix® database servers support the following methods of collation for character data:
  • Code-set collation order is the bit-pattern order of characters within a code set.

    The order of the code points in the code set determines the sort order.

  • Locale-specific order is an order of the characters that relates to a real language.

    The LC_COLLATE category of a GLS locale file defines the order of the characters in the locale-specific order.

For more information about code-set and locale-specific order, see the Informix GLS User's Guide.

To perform code-set collation, you compare only the integer values of two multibyte or two wide characters. For example, suppose one multibyte character, mbs1, contains the value A1A2A3 and a second multibyte character, mbs2, contains the value B1B2B3. If the integer value of A1A2A3 is less than the integer value of B1B2B3, then mbs1 is less than mbs2 in code-set collation order.
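The comparison described above can be sketched in plain C. This is a minimal illustration, not part of the GLS API: the helper names `code_point_value()` and `code_set_compare_char()` are hypothetical, the bytes are assumed to be packed most significant byte first, and reading A1A2A3 and B1B2B3 as the hexadecimal byte sequences 0xA1 0xA2 0xA3 and 0xB1 0xB2 0xB3 is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Pack the bytes of one multibyte character into an integer,
 * most significant byte first. Assumption: at most 4 bytes,
 * which always fits in an unsigned long (at least 32 bits). */
static unsigned long code_point_value(const unsigned char *mb, size_t len)
{
    unsigned long value = 0;

    assert(len <= 4);
    for (size_t i = 0; i < len; i++) {
        value = (value << 8) | mb[i];
    }
    return value;
}

/* Compare two multibyte characters in code-set order.
 * Returns a negative value, zero, or a positive value when the
 * first character is less than, equal to, or greater than the second. */
int code_set_compare_char(const unsigned char *mb1, size_t len1,
                          const unsigned char *mb2, size_t len2)
{
    unsigned long v1 = code_point_value(mb1, len1);
    unsigned long v2 = code_point_value(mb2, len2);

    return (v1 > v2) - (v1 < v2);
}
```

With mbs1 holding the bytes 0xA1 0xA2 0xA3 and mbs2 holding 0xB1 0xB2 0xB3, `code_set_compare_char(mbs1, 3, mbs2, 3)` returns a negative value, so mbs1 is less than mbs2 in code-set order.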

However, sometimes you want to sort character data according to the language usage of the characters. In code-set order, the character a is greater than the character A. In many contexts, you would probably not want the string Apple to sort before the string apple. The locale-specific order could list the character A after the character a. Similarly, even though the character À might have a code point of 133, the locale-specific order could list this character after A and before B (A=65, À=133, B=66). In this case, the string ÀB sorts after AC but before BD.
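The difference between the two orders can be sketched with a toy weight table. This is only an illustration under stated assumptions: it uses a single-byte code set in which A=65 and B=66 (as in ASCII), it gives the character at code point 133 a weight between those of A and B, and the names `weight()`, `locale_order_compare()`, and `code_set_compare()` are hypothetical. A real locale obtains its weights from the LC_COLLATE category, not from a hard-coded function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy collation weight. Every code point c gets weight 2*c, except
 * code point 133, which is slotted between 'A' (65) and 'B' (66). */
static int weight(unsigned char c)
{
    if (c == 133) {
        return 2 * 'A' + 1;
    }
    return 2 * c;
}

/* Compare two single-byte strings by the toy locale-specific weights.
 * Returns <0, 0, or >0, like the comparison functions in this section. */
int locale_order_compare(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;

    assert(s1 != NULL && s2 != NULL);
    while (*p != '\0' && weight(*p) == weight(*q)) {
        p++;
        q++;
    }
    return weight(*p) - weight(*q);
}

/* Compare the same strings in code-set order: strcmp() compares
 * the raw byte values as unsigned char. */
int code_set_compare(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    return strcmp(s1, s2);
}
```

In this sketch, the string made of code point 133 followed by B, written in C as `"\x85" "B"`, compares greater than AC and less than BD under `locale_order_compare()`, but greater than BD under `code_set_compare()`.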

The following table lists the functions that use locale-specific order to compare two multibyte-character or wide-character strings.

String-comparison task                                    Multibyte-character function    Wide-character function
Compare two character strings by locale-specific order.   ifx_gl_mbscoll()                ifx_gl_wcscoll()

These functions access the LC_COLLATE category of a locale file to obtain localized collating information when they compare character strings. They return an integer value that indicates the results of the comparison between two string arguments. The following table shows the comparison between the first string argument (Arg 1) and the second string argument (Arg 2), as well as the return values.

Argument comparison    Return value
Arg 1 < Arg 2          < 0
Arg 1 = Arg 2          0
Arg 1 > Arg 2          > 0

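A return value that follows this negative, zero, or positive convention can drive a sort routine directly. The following sketch shows the idea with the standard C functions strcoll() and qsort() as stand-ins. It does not call the GLS functions, which take additional length arguments, and the names `by_collation()` and `sort_by_collation()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: each element is a pointer to a string.
 * strcoll() compares by the current locale's collation order and
 * returns <0, 0, or >0, which is exactly what qsort() expects. */
static int by_collation(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

/* Sort an array of strings in place by locale-specific order. */
void sort_by_collation(const char **strings, size_t count)
{
    assert(strings != NULL || count == 0);
    qsort(strings, count, sizeof *strings, by_collation);
}
```

A C program starts in the "C" locale, where strcoll() orders strings the same way strcmp() does. A program would call setlocale() first to sort by another locale's collation order.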
The ifx_gl_mbscoll() and ifx_gl_wcscoll() functions do not return a special value if an error has occurred. Therefore, to detect an error condition, you must initialize the ifx_gl_lc_errno() error number to zero before you call one of these functions and check ifx_gl_lc_errno() after you call the function. The following code fragment performs error checking for a call to the ifx_gl_wcscoll() function:
/* Initialize the error number */
ifx_gl_lc_errno() = 0;

/* Compare the two wide-character strings */
value = ifx_gl_wcscoll(wcs1, wcs1_char_length, wcs2,
wcs2_char_length);

/* If the error number has changed, ifx_gl_wcscoll() has
 * set it to indicate the cause of an error */
if ( ifx_gl_lc_errno() != 0 ) {
    /* Handle error */
} else if ( value < 0 ) {
    /* wcs1 is less than wcs2 */
} else if ( value == 0 ) {
    /* wcs1 is equal to wcs2 */
} else {
    /* wcs1 is greater than wcs2 */
}
...
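The same initialize-then-check pattern applies to the standard C function wcscoll(), which also reserves no return value for errors. The following sketch wraps the pattern in a helper so that it can be compiled without the GLS library. The helper name `checked_wcscoll()` and its out-parameter convention are hypothetical, and errno stands in for ifx_gl_lc_errno().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <wchar.h>

/* Compare two wide-character strings by locale-specific order.
 * Returns 0 on success and stores the comparison result (<0, 0, >0)
 * in *result. Returns -1 if wcscoll() reported an error through errno,
 * in which case *result is left unchanged. */
int checked_wcscoll(const wchar_t *ws1, const wchar_t *ws2, int *result)
{
    int value;

    assert(result != NULL);

    /* Initialize the error number before the call */
    errno = 0;
    value = wcscoll(ws1, ws2);

    /* A nonzero errno after the call means the comparison failed */
    if (errno != 0) {
        return -1;
    }
    *result = value;
    return 0;
}
```

Keeping the status separate from the comparison result prevents a caller from mistaking an error for a valid ordering.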