Filter non-ASCII characters

When it compiles an HCL OneDB™ ESQL/C source program, the HCL OneDB ESQL/C processor calls the C compiler. If your HCL OneDB ESQL/C source code contains non-ASCII characters, the way that the C compiler handles such characters can affect whether compilation succeeds.

In particular, the following situations might affect compilation of your HCL OneDB ESQL/C program:
  • Multibyte characters might contain C-language tokens.

    A component of a multibyte character might be indistinguishable from some single-byte characters such as percent ( % ), comma ( , ), backslash ( \ ), and double quotation mark ( " ) characters. If such characters are included in a quoted string, the C compiler might interpret them as C-language tokens, which can cause compilation errors or even lost characters.

  • The C compiler might not be 8-bit clean.

    If a code set contains non-ASCII characters (with code values that are greater than 127), the C compiler must be 8-bit clean to interpret the characters. To be 8-bit clean, a compiler must read the eighth bit as part of the code value; it must not ignore or put its own interpretation on the meaning of this eighth bit.
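The 8-bit-clean requirement can be illustrated with a minimal C sketch. The two helper functions below are hypothetical names introduced only for this illustration; they are not part of HCL OneDB ESQL/C. The first tests whether a byte has a code value greater than 127, and the second simulates what a compiler that is not 8-bit clean would do to that byte.

```c
/*
 * Illustration only: is_non_ascii() and strip_eighth_bit() are hypothetical
 * helpers, not HCL OneDB ESQL/C functions.
 */

/* Returns nonzero if the byte has its eighth (high) bit set, that is,
 * if its code value is greater than 127. */
static int is_non_ascii(unsigned char byte)
{
    return (byte & 0x80) != 0;
}

/* Simulates a compiler that is not 8-bit clean: the eighth bit is dropped,
 * so the byte silently becomes a different, 7-bit ASCII code value. */
static unsigned char strip_eighth_bit(unsigned char byte)
{
    return (unsigned char)(byte & 0x7F);
}
```

For example, strip_eighth_bit(0xA4) yields 0x24, so a compiler that ignores the eighth bit would see a different character from the one in the source file.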

To filter a non-ASCII character, the HCL OneDB ESQL/C filter, esqlmf, converts each byte of the character to its octal equivalent. For example, suppose the multibyte character A1A2A3 has an octal representation of \160\042\244 and appears in the following stcopy() call:
stcopy("A1A2A3", dest);
After esqlmf filters the HCL OneDB ESQL/C source file, the C compiler sees this line as follows:
stcopy("\160\042\244", dest); /* correct interpretation */
This conversion handles the C-language-token situation. Without it, the C compiler would interpret the A2 byte (octal \042) as an ASCII double quotation mark and parse the line incorrectly, as follows:
stcopy("A1"A3, dest); /* incorrect interpretation of A2 */

The C compiler would generate an error for the preceding line because the A2 byte ends the string argument prematurely. The esqlmf utility also handles the 8-bit-clean situation because it prevents the C compiler from ignoring the eighth bit of the A3 byte. If the compiler ignored the eighth bit, it would incorrectly interpret A3 (octal \244) as octal \044, the ASCII dollar sign ( $ ).
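The byte-to-octal conversion described above can be sketched in C as follows. This is a simplified illustration, not the actual esqlmf implementation: the function name to_octal_escapes() and its parameters are assumptions made for this example, and the sketch escapes every byte that it is given, so the caller is assumed to pass only the bytes of the non-ASCII character.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of esqlmf): writes each of the len bytes in
 * src to dst as a four-character octal escape such as \244, followed by a
 * terminating NUL. Returns 0 on success, or -1 if dst is too small.
 */
static int to_octal_escapes(const unsigned char *src, size_t len,
                            char *dst, size_t dst_size)
{
    size_t i;

    /* Each byte needs 4 characters, plus 1 for the terminating NUL. */
    if (dst_size == 0 || len > (dst_size - 1) / 4)
        return -1;

    for (i = 0; i < len; i++) {
        /* %03o pads to three octal digits; a byte never needs more. */
        snprintf(dst + i * 4, 5, "\\%03o", (unsigned int)src[i]);
    }
    dst[len * 4] = '\0';
    return 0;
}
```

Given the three bytes of the example character (octal 160, 042, and 244) and a buffer of at least 13 characters, this sketch produces the text \160\042\244. The C compiler then reads that text as three escape sequences inside the string literal and never sees a raw double quotation mark or a byte with the eighth bit set.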