Truncate multibyte strings

Sometimes you need to truncate a long character string so that it fits into a smaller buffer. When you truncate a character string that contains just single-byte characters, you can truncate at an arbitrary byte location in the string. Because each character is one byte long, the truncated result still contains only complete characters.

However, to truncate a string that might contain even one multibyte character, you must take special measures. If you truncate at an arbitrary byte location in a multibyte-character string, you might truncate in the middle of a multibyte character. In this case, the truncated string ends with only the first 1, 2, or 3 bytes of a multibyte character, without the remaining bytes of that character. A function that later traverses such a string could attempt to read beyond the end of the buffer to find the missing bytes.

Therefore, all functions that traverse one multibyte character or a length-terminated multibyte-character string set the error number to IFX_GL_EINVAL if they detect that an otherwise valid character has been truncated.
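The distinction between a truncated character and an illegal one can be illustrated with a minimal sketch. It does not call the GLS library: it assumes UTF-8 as a stand-in code set, and the names mb_char_len, mb_check, MB_ILLEGAL, and MB_TRUNCATED are hypothetical. The two status codes only mirror the roles of IFX_GL_EILSEQ and IFX_GL_EINVAL. The sketch does not reject overlong forms or surrogates.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch only. They mirror the roles of
 * IFX_GL_EILSEQ (illegal sequence) and IFX_GL_EINVAL (truncated character);
 * they are not the library's values. */
enum { MB_OK = 0, MB_ILLEGAL = -1, MB_TRUNCATED = -2 };

/* Byte length announced by a UTF-8 lead byte, or 0 if it is not a lead byte. */
static int utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Length of the character at s, reading at most limit bytes.
 * Returns MB_TRUNCATED if the bytes present are valid but the buffer ends
 * before the character does, and MB_ILLEGAL if a byte is wrong. */
static int mb_char_len(const unsigned char *s, size_t limit)
{
    int need, i;

    if (limit == 0) return MB_TRUNCATED;      /* nothing left to read */
    need = utf8_expected_len(s[0]);
    if (need == 0) return MB_ILLEGAL;
    for (i = 1; i < need; i++) {
        if ((size_t)i >= limit) return MB_TRUNCATED;
        if ((s[i] & 0xC0) != 0x80) return MB_ILLEGAL;
    }
    return need;
}

/* Traverse a length-terminated string without reading past s[len - 1].
 * Returns MB_OK, or the status of the first character that fails. */
static int mb_check(const unsigned char *s, size_t len)
{
    size_t pos = 0;

    while (pos < len) {
        int n = mb_char_len(s + pos, len - pos);
        if (n < 0) return n;
        pos += (size_t)n;
    }
    return MB_OK;
}
```

Because every read is bounded by the remaining length, a string that was cut in the middle of a character is reported as truncated instead of causing a read beyond the end of the buffer.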

If you know that the string has not been truncated, you can treat the IFX_GL_EINVAL error the same as IFX_GL_EILSEQ. However, if the string might have been truncated, IFX_GL_EINVAL indicates that you need to truncate the string further so that its last character is complete. Depending on your application, you might take one of the following actions:
  • Make the truncated string even shorter than originally intended.
  • Replace the first 1, 2, or 3 bytes of the truncated character with a padding character that is appropriate for your application.
Important: Even though the library functions can detect invalid characters after truncation has occurred, it is much better to avoid truncating in the middle of a character in the first place.
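The two actions listed above can be sketched as follows. This is an illustration under stated assumptions, not the library's method: it assumes UTF-8 as a stand-in code set, assumes the untruncated source string is valid, and assumes a single-byte padding character. The function names complete_prefix_len, truncate_shorter, and truncate_padded are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Number of leading bytes of s[0..len) that hold only complete characters.
 * Assumes UTF-8 and that the string was valid before it was cut at len. */
static size_t complete_prefix_len(const unsigned char *s, size_t len)
{
    size_t p, need;
    unsigned char lead;

    if (len == 0) return 0;
    p = len - 1;
    /* Step back over continuation bytes (10xxxxxx), at most 3 of them,
     * to reach the lead byte of the last character. */
    while (p > 0 && len - 1 - p < 3 && (s[p] & 0xC0) == 0x80)
        p--;
    lead = s[p];
    if (lead < 0x80) need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else return len;   /* not a lead byte: illegal data, not truncation */
    /* If the last character needs more bytes than remain, drop it. */
    return (p + need > len) ? p : len;
}

/* Action 1: make the result shorter than cap if necessary so that it ends
 * on a character boundary. Returns the number of bytes stored in dst. */
static size_t truncate_shorter(unsigned char *dst, size_t cap,
                               const unsigned char *src, size_t src_len)
{
    size_t n = src_len < cap ? src_len : cap;

    n = complete_prefix_len(src, n);
    memcpy(dst, src, n);
    return n;
}

/* Action 2: keep the intended length, but overwrite the leading bytes of
 * the cut character with pad. Returns the number of bytes stored in dst. */
static size_t truncate_padded(unsigned char *dst, size_t cap,
                              const unsigned char *src, size_t src_len,
                              unsigned char pad)
{
    size_t n = src_len < cap ? src_len : cap;
    size_t keep = complete_prefix_len(src, n);

    memcpy(dst, src, keep);
    memset(dst + keep, pad, n - keep);
    return n;
}
```

For example, with a 5-byte source that holds a 1-byte character, a 3-byte character, and another 1-byte character, a 3-byte buffer receives 1 byte from truncate_shorter and 1 byte plus 2 padding bytes from truncate_padded. Both functions decide where to cut before copying, so the partial character never reaches the destination buffer.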