.Pp
If
.Fa s
is a null pointer, the
.Nm
call is equivalent to:
.Bd -ragged -offset indent
.Fo mbrtoc8
.Li NULL ,
.Li \*q\*q ,
.Li 1 ,
.Fa ps
.Fc
.Ed
.Pp
This always returns zero, and has the effect of resetting
.Fa ps
to the initial conversion state, without writing to
.Fa pc8 ,
even if it is nonnull.
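.Pp
As a minimal sketch of this reset, assuming a C23
.In uchar.h
and a state with no pending output, a caller might wrap the call as
follows.
The helper name
.Fn reset_state
is hypothetical, and the
.Dv _GNU_SOURCE
definition is an assumption: some C libraries need it to expose C23
names in older language modes.
.Bd -literal -offset indent
#define _GNU_SOURCE	/* assumption: exposes C23 names on some libcs */
#include <assert.h>
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: put *ps back in the initial conversion state. */
static size_t
reset_state(mbstate_t *ps)
{
	char8_t c8 = 0x55;	/* sentinel to show pc8 is not written */
	size_t len;

	len = mbrtoc8(&c8, NULL, 0, ps);	/* n is ignored when s is null */
	assert(c8 == 0x55);
	return len;	/* zero, per the text above */
}
.Ed
.Pp
Afterward,
.Xr mbsinit 3
is expected to return nonzero for the same
.Fa ps .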
.Pp
If
.Fa ps
is a null pointer,
.Nm
uses an internal
.Vt mbstate_t
object with static storage duration, distinct from all other
.Vt mbstate_t
objects
.Po
including those used by
.Xr mbrtoc16 3 ,
.Xr mbrtoc32 3 ,
.Xr c8rtomb 3 ,
.Xr c16rtomb 3 ,
and
.Xr c32rtomb 3
.Pc ,
which is initialized at program startup to the initial conversion
state.
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh RETURN VALUES
The
.Nm
function returns:
.Bl -tag -width Li
.It Li 0
.Pq null
if within the next
.Fa n
bytes at
.Fa s
the first multibyte character is null.
.It Ar i
.Pq code unit
where
.Li 1
\*(Le
.Ar i
\*(Le
.Fa n ,
if either
.Fa ps
is in the initial conversion state or the previous call to
.Nm
with
.Fa ps
had not yielded an incomplete UTF-8 code unit, and within the first
.Ar i
bytes at
.Fa s
a Unicode scalar value was decoded.
.It Li (size_t)-3
.Pq continuation
if the previous call to
.Nm
with
.Fa ps
had yielded an incomplete UTF-8 code unit for a Unicode scalar value
outside the US-ASCII range; no additional input is consumed in this
case.
.It Li (size_t)-2
.Pq incomplete
if either
.Fa ps
is in the initial conversion state or the previous call to
.Nm
with
.Fa ps
had not yielded an incomplete UTF-8 code unit, and within the first
.Fa n
bytes at
.Fa s ,
including any previously buffered input, no complete Unicode scalar
value could be decoded.
.It Li (size_t)-1
.Pq error
if any encoding error was detected;
.Xr errno 2
is set to reflect the error.
.El
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh EXAMPLES
Print the UTF-8 code units of a multibyte string in hexadecimal text:
.Bd -literal -offset indent
char *s = ...;
size_t n = ...;
mbstate_t mbs = {0};	/* initial conversion state */

while (n) {
	char8_t c8;
	size_t len;

	len = mbrtoc8(&c8, s, n, &mbs);
	switch (len) {
	case 0:		/* null terminator */
		assert(c8 == '\e0');
		goto out;
	default:	/* consumed input and yielded a byte c8 */
		printf("0x%02hhx\en", c8);
		break;
	case (size_t)-3:	/* yielded a pending byte c8 */
		printf("continue 0x%02hhx\en", c8);
		continue;	/* no input consumed; do not advance s */
	case (size_t)-2:	/* incomplete */
		printf("incomplete\en");
		goto readmore;
	case (size_t)-1:	/* error */
		printf("error: %d\en", errno);
		goto out;
	}
	s += len;
	n -= len;
}
.Ed
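.Pp
As a self-contained variant of the loop above, the following sketch
counts the UTF-8 code units that
.Nm
yields for a buffer, and stops once the input is exhausted and no code
unit is pending.
The helper name
.Fn count_c8
is hypothetical, incomplete input is treated as an error for brevity,
and the
.Dv _GNU_SOURCE
definition is an assumption: some C libraries need it to expose C23
names in older language modes.
.Bd -literal -offset indent
#define _GNU_SOURCE	/* assumption: exposes C23 names on some libcs */
#include <assert.h>
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the code units yielded for the n bytes
 * at s, stopping early at a null character.  Returns (size_t)-1 on
 * an encoding error or incomplete input.
 */
static size_t
count_c8(const char *s, size_t n)
{
	mbstate_t mbs = {0};	/* initial conversion state */
	size_t count = 0;

	assert(s != NULL);	/* a null s would only reset the state */
	for (;;) {
		char8_t c8;
		size_t len;

		/* Done once input is exhausted and nothing is pending. */
		if (n == 0 && mbsinit(&mbs))
			return count;
		len = mbrtoc8(&c8, s, n, &mbs);
		if (len == 0)
			return count;	/* null character */
		if (len == (size_t)-1 || len == (size_t)-2)
			return (size_t)-1;	/* error or incomplete */
		count++;	/* c8 holds one code unit */
		if (len == (size_t)-3)
			continue;	/* pending unit; no input consumed */
		s += len;
		n -= len;
	}
}
.Ed
.Pp
Checking
.Xr mbsinit 3
before each call lets the loop keep draining pending code units after
the last input byte has been consumed.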
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh ERRORS
.Bl -tag -width Bq
.It Bq Er EILSEQ
The multibyte sequence cannot be decoded as a Unicode scalar value.
.It Bq Er EIO
An error occurred in loading the locale's character conversions.
.El
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SEE ALSO
.Xr c8rtomb 3 ,
.Xr c16rtomb 3 ,
.Xr c32rtomb 3 ,
.Xr mbrtoc16 3 ,
.Xr mbrtoc32 3 ,
.Xr uchar 3
.Rs
.%B The Unicode Standard
.%O Version 15.0 \(em Core Specification
.%Q The Unicode Consortium
.%D September 2022
.%U https://www.unicode.org/versions/Unicode15.0.0/UnicodeStandard-15.0.pdf
.Re
.Rs
.%A F. Yergeau
.%T UTF-8, a transformation format of ISO 10646
.%R RFC 3629
.%D November 2003
.%I Internet Engineering Task Force
.%U https://datatracker.ietf.org/doc/html/rfc3629
.Re
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh STANDARDS
The
.Nm
function conforms to
.St -isoC-2023 .
.\" XXX PR misc/58600: man pages lack C17, C23, C++98, C++03, C++11, C++17, C++20, C++23 citation syntax
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh HISTORY
The
.Nm
function first appeared in
.Nx 11.0 .