Dico
GNU Dictionary Server
Sergey Poznyakoff
D.10 UTF-8
This section describes functions for handling UTF-8 strings. A UTF-8 character can be represented either as a multibyte character or as a wide character.
A multibyte character is a char * pointing to one or more bytes representing the UTF-8 character.
A wide character is an unsigned value identifying the character.
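To make the relation between the two representations concrete, here is a minimal sketch in C that decodes one multibyte character into a wide character. The function name sketch_mb_to_wc is hypothetical and is not part of the Dico library; the sketch assumes standard UTF-8 encoding of up to four bytes and, for brevity, does not reject overlong forms or surrogate code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Dico API.
   Decode the multibyte character at MB into the wide character *WC.
   Return the number of bytes consumed (1 to 4), or 0 if the lead byte
   or one of the continuation bytes is malformed.  Overlong forms and
   surrogates are not rejected in this sketch. */
size_t
sketch_mb_to_wc(const char *mb, unsigned *wc)
{
    const unsigned char *p = (const unsigned char *) mb;
    unsigned value;
    size_t len, i;

    assert(mb != NULL && wc != NULL);

    if (p[0] < 0x80) {                    /* 0xxxxxxx: one byte (ASCII) */
        value = p[0];
        len = 1;
    } else if ((p[0] & 0xE0) == 0xC0) {   /* 110xxxxx: two bytes */
        value = p[0] & 0x1F;
        len = 2;
    } else if ((p[0] & 0xF0) == 0xE0) {   /* 1110xxxx: three bytes */
        value = p[0] & 0x0F;
        len = 3;
    } else if ((p[0] & 0xF8) == 0xF0) {   /* 11110xxx: four bytes */
        value = p[0] & 0x07;
        len = 4;
    } else
        return 0;                         /* stray continuation or invalid byte */

    for (i = 1; i < len; i++) {
        /* Every continuation byte has the form 10xxxxxx.  A nul byte
           fails this test, so the loop never reads past the terminator. */
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        value = (value << 6) | (p[i] & 0x3F);
    }
    *wc = value;
    return len;
}
```

For instance, the two bytes 0xC3 0xA9 form one multibyte character whose wide character value is 0xE9 (the letter é), while the single byte 0x41 and the wide character 0x41 both denote the letter A.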
In the discussion below, a sequence of one or more multibyte characters is called a multibyte string. Multibyte strings terminate with a single ‘nul’ (0) character.
A sequence of one or more wide characters is called a wide character string. Such strings terminate with a single 0 value.
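The difference between the two string types shows up when measuring length. The following sketch counts the characters in each kind of string; the names sketch_mb_strlen and sketch_wc_strlen are hypothetical and are not Dico functions, and the multibyte version assumes its input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Dico API.
   Count the characters in a nul-terminated multibyte string.
   Assumes well-formed UTF-8: each character contributes exactly one
   byte that is not a continuation byte (10xxxxxx). */
size_t
sketch_mb_strlen(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper, not part of the Dico API.
   Count the characters in a wide character string, which is an array
   of unsigned values terminated by a single 0 value. */
size_t
sketch_wc_strlen(const unsigned *ws)
{
    size_t n = 0;

    assert(ws != NULL);
    while (ws[n] != 0)
        n++;
    return n;
}
```

As an illustration, the word "naïve" is six bytes plus the terminating nul as a multibyte string ("na" "\xC3\xAF" "ve"), and five elements plus the terminating 0 as a wide character string ({ 0x6E, 0x61, 0xEF, 0x76, 0x65, 0 }). Both functions report five characters.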
This document was generated on September 4, 2020 using makeinfo.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.