History log of /src/tests/lib/libc/locale
Revision  Date  Author  Comments
 1.18 15-Aug-2024  riastradh libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L

(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
 1.17 15-Aug-2024  riastradh libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
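
To illustrate what "restartable" means for a caller, here is a minimal sketch (not taken from the commit) that hands mbrtoc32 one byte per call and lets the mbstate_t carry any partial character between calls. The helper name count_chars_bytewise is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: count the characters in the n bytes at s,
 * handing mbrtoc32 a single byte per call.  Returns -1 on an
 * invalid sequence.
 */
static long
count_chars_bytewise(const char *s, size_t n)
{
	mbstate_t st;
	char32_t c32;
	long count = 0;
	size_t i = 0;

	memset(&st, 0, sizeof(st));	/* initial conversion state */
	while (i < n) {
		size_t r = mbrtoc32(&c32, s + i, 1, &st);

		if (r == (size_t)-1)
			return -1;	/* invalid sequence */
		if (r == (size_t)-2) {
			i++;		/* byte absorbed into st; no output yet */
			continue;
		}
		count++;		/* c32 now holds one code point */
		if (r != (size_t)-3)
			i++;		/* one byte consumed (r is 0 or 1) */
	}
	return count;
}
```

In a locale with multi-byte characters, the (size_t)-2 branch is where the state picks up where the previous call left off.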
 1.16 15-Aug-2024  riastradh uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.

PR lib/52374: <uchar.h> missing
 1.15 14-Aug-2024  riastradh tests/lib/libc/locale/Makefile: Sort.

No functional change intended.

Preparation for PR lib/52374.
 1.14 27-Nov-2023  christos branches: 1.14.2;
Don't use fmtcheck for strfmon format strings. It does not work. Fix a broken
test.
 1.13 28-Jul-2019  christos branches: 1.13.10;
PR/54414: Valery Ushakov: add a test for wcsrtombs(3) doesn't update the
source argument on conversion error
 1.12 16-Aug-2017  joerg branches: 1.12.4;
Add missing strfmon_l. Noticed by Bruno Haible. Add test case.
 1.11 23-Jul-2017  perseant Add missing files from last commit:

Move Unicode <-> ku/ten mapping into the individual codec modules.
Mapping is based on existing iconv data for single-byte encodings,
and included for several, but not all, multibyte encodings.
 1.10 14-Jul-2017  perseant branches: 1.10.2;
Add a simple collation test. This test is expected to fail on HEAD since
we do not yet have a working implementation of wcscoll.
 1.9 01-Jun-2017  perseant branches: 1.9.2;
Add tests for btowc(3)/wctob(3) and enable compilation of the test for
digittoint(3).

The digittoint(3) test is skipped since we don't provide that function yet.

One of the test cases for btowc(3) is also skipped, since it tests conversion
to Unicode---whereas our wchar_t representation is locale-dependent.
 1.8 30-May-2017  perseant Add test cases for sprintf/sscanf/strto{d,l} and the is* and isw* ctype functions, for single-byte encodings
 1.7 30-May-2017  perseant Add simple test case for toupper/tolower
 1.6 28-May-2013  joerg Add mbsnrtowcs and wcsnrtombs. Approved by core.
 1.5 28-Feb-2013  christos regression tests for wide char i/o. Currently there are failures.
 1.4 21-Nov-2011  joerg branches: 1.4.6;
Add test cases for strcspn, strpbrk, strspn, wcscspn, wcspbrk and
wcsspn.
 1.3 15-Jul-2011  jruoho branches: 1.3.2;
Rename two test files to get functional scope (and avoid confusion
with ctype(3)). No functional change.
 1.2 11-Apr-2011  tron Fix build with stack smash protection enabled.
 1.1 09-Apr-2011  pgoyette atf-ify the various locale tests
 1.3.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.3.2.1 17-Apr-2012  yamt sync with head
 1.4.6.1 23-Jun-2013  tls resync from head
 1.9.2.1 29-Aug-2017  martin Pull up following revision(s) (requested by joerg in ticket #215):
tests/lib/libc/locale/t_strfmon.c: revision 1.1
tests/lib/libc/locale/Makefile: revision 1.12
lib/libc/stdlib/strfmon.c: revision 1.11
distrib/sets/lists/debug/mi: revision 1.224
include/monetary.h: revision 1.3
distrib/sets/lists/tests/mi: revision 1.761
lib/libc/stdlib/strfmon.3: revision 1.6
lib/libc/stdlib/strfmon.3: revision 1.7
Add missing strfmon_l. Noticed by Bruno Haible. Add test case.
Typo fix.
 1.10.2.2 23-Jul-2017  perseant Add Unicode copyright notice and more verbose DUCET test.
 1.10.2.1 14-Jul-2017  perseant Initial commit of a mostly-working implementation of __STDC_ISO_10646__,
with collation support using the Unicode Collation Algorithm.

The conversion from men/ku/ten form to Unicode is a gross hack at present.
Fixing this, and fleshing out the LC_COLLATE locale component, are next
on the agenda.
 1.12.4.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.13.10.1 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.
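
A minimal sketch of the buffer-sizing point, not the man page's actual example: MB_LEN_MAX is a compile-time constant, so the scratch buffer can be an ordinary array, whereas MB_CUR_MAX is a run-time value and would call for a variable-length array. The helper name encode_c32 is hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: encode n UTF-32 code units from src into dst
 * according to the current locale.  Returns the number of bytes
 * written, or (size_t)-1 on an encoding error or if dst is too small.
 */
static size_t
encode_c32(char *dst, size_t dstlen, const char32_t *src, size_t n)
{
	char buf[MB_LEN_MAX];	/* fixed size; no VLA needed */
	mbstate_t st;
	size_t used = 0;

	memset(&st, 0, sizeof(st));	/* initial conversion state */
	for (size_t i = 0; i < n; i++) {
		size_t r = c32rtomb(buf, src[i], &st);

		if (r == (size_t)-1 || r > dstlen - used)
			return (size_t)-1;
		memcpy(dst + used, buf, r);
		used += r;
	}
	return used;
}
```

Because the same mbstate_t is passed to every call, any shift state established by one character is available to the next.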

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).
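
As a rough illustration of the null-output case (a sketch under the behaviour described above, not code from the commit): passing a null s makes c32rtomb ignore the code unit argument and leave ps in the initial conversion state. The function name null_output_resets_state is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical check: returns 1 if a call with s == NULL succeeds and
 * leaves the conversion state initial, 0 otherwise.
 */
static int
null_output_resets_state(void)
{
	mbstate_t st;
	size_t r;

	memset(&st, 0, sizeof(st));
	/* s == NULL: the c32 argument (here 'A') is ignored. */
	r = c32rtomb(NULL, U'A', &st);
	return r != (size_t)-1 && mbsinit(&st) != 0;
}
```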
c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments

mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.

uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.14.2.1 02-Aug-2025  perseant Sync with HEAD
 1.2 23-Jul-2017  perseant Add missing files from last commit:

Move Unicode <-> ku/ten mapping into the individual codec modules.
Mapping is based on existing iconv data for single-byte encodings,
and included for several, but not all, multibyte encodings.
 1.1 14-Jul-2017  perseant branches: 1.1.2;
file ducet_test.h was initially added on branch perseant-stdc-iso10646.
 1.1.2.2 23-Jul-2017  perseant Add Unicode copyright notice and more verbose DUCET test.
 1.1.2.1 14-Jul-2017  perseant Initial commit of a mostly-working implementation of __STDC_ISO_10646__,
with collation support using the Unicode Collation Algorithm.

The conversion from men/ku/ten form to Unicode is a gross hack at present.
Fixing this, and fleshing out the LC_COLLATE locale component, are next
on the agenda.
 1.3 10-Aug-2017  perseant Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.2 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.1 01-Jun-2017  perseant branches: 1.1.2;
Add tests for btowc(3)/wctob(3) and enable compilation of the test for
digittoint(3).

The digittoint(3) test is skipped since we don't provide that function yet.

One of the test cases for btowc(3) is also skipped, since it tests conversion
to Unicode---whereas our wchar_t representation is locale-dependent.
 1.1.2.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.6 19-Aug-2024  riastradh branches: 1.6.2; 1.6.6;
c32rtomb(3): Use conversion state to handle shift sequences.

For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:

1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state

This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.

Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing

- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character

XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
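
The byte counts quoted in this message can be checked with a small model. This is arithmetic only, not libc code: it assumes a three-byte shift sequence and a one-byte encoded character, as in the YEN SIGN example, and the names SHIFT_LEN, CHAR_LEN, bytes_stateless and bytes_stateful are hypothetical.

```c
#include <stddef.h>

/* Assumed sizes, taken from the YEN SIGN example in the message. */
enum { SHIFT_LEN = 3, CHAR_LEN = 1 };

/* Old behaviour: shift in, encode, shift out, for every character. */
static size_t
bytes_stateless(size_t nchars)
{
	return nchars * (SHIFT_LEN + CHAR_LEN + SHIFT_LEN);
}

/*
 * Stateful behaviour: shift in once, encode each character, and
 * shift back to the initial state once at the end of the run.
 */
static size_t
bytes_stateful(size_t nchars)
{
	if (nchars == 0)
		return 0;
	return SHIFT_LEN + nchars * CHAR_LEN + SHIFT_LEN;
}
```

Under these assumptions one character costs seven bytes either way, while two consecutive characters cost 14 bytes in the stateless model and 8 in the stateful one, counting the final shift back to the initial state.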
 1.5 19-Aug-2024  riastradh t_c8rtomb, t_c16rtomb: Simplify comment.

ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
 1.4 18-Aug-2024  riastradh c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
 1.3 18-Aug-2024  riastradh c8rtomb(3), c16rtomb(3): Fix NUL handling.

PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong
 1.2 17-Aug-2024  riastradh c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.

PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong
 1.1 15-Aug-2024  riastradh libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
 1.6.6.2 02-Aug-2025  perseant Sync with HEAD
 1.6.6.1 19-Aug-2024  perseant file t_c16rtomb.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.6.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.
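
As an illustration of what "restartable" means here, the following is a
minimal sketch (not code from the commit) that feeds mbrtoc32 one byte at a
time and relies on the mbstate_t to carry partial characters between calls.
The helper name decode_bytewise and its error convention are assumptions
made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode n bytes of s, one byte per call, into
 * out[0..outcap).  Returns the number of code points stored, or
 * (size_t)-1 on a decoding error or if out is too small.
 */
size_t
decode_bytewise(const char *s, size_t n, char32_t *out, size_t outcap)
{
	mbstate_t ps;
	size_t i, nout = 0;

	assert(s != NULL && out != NULL);
	memset(&ps, 0, sizeof(ps));	/* initial conversion state */

	for (i = 0; i < n; i++) {
		char32_t c32;
		size_t len = mbrtoc32(&c32, &s[i], 1, &ps);

		if (len == (size_t)-2)
			continue;	/* incomplete: ps remembers the byte */
		if (len == (size_t)-1 || len == (size_t)-3)
			return (size_t)-1;	/* invalid or unexpected */
		if (nout == outcap)
			return (size_t)-1;	/* no room left in out */
		out[nout++] = c32;	/* len is 0 (NUL) or 1 here */
	}
	return nout;
}
```

Because the state lives in ps, the caller may split the input at arbitrary
byte boundaries; in a multibyte locale the intermediate calls simply return
(size_t)-2 until a character is complete.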

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.
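
A minimal sketch of the pattern this describes, assuming a caller that
encodes a single code point: the output buffer is a fixed-size array of
MB_LEN_MAX bytes rather than a variable-length array of MB_CUR_MAX bytes.
The helper name encode_one is hypothetical and is not taken from the man
page example.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode one UTF-32 code unit into buf using the
 * current locale, starting from the initial conversion state.  Returns
 * the number of bytes stored, or (size_t)-1 on error.
 */
size_t
encode_one(char32_t c32, char buf[MB_LEN_MAX])
{
	mbstate_t ps;

	assert(buf != NULL);
	memset(&ps, 0, sizeof(ps));	/* initial conversion state */
	return c32rtomb(buf, c32, &ps);
}
```

MB_LEN_MAX is a compile-time constant that bounds MB_CUR_MAX in every
locale, so the array size never depends on the run-time locale.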

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).
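
A minimal sketch of the null-output case, assuming the behaviour the C
standard gives for a null s (equivalent to encoding a zero code unit into
an internal buffer): the call produces no caller-visible output and leaves
ps in the initial conversion state. The helper name reset_to_initial is
hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: reset *ps through c32rtomb with a null output
 * pointer.  The code unit argument is ignored in this case, so plain 0
 * is passed.  Returns 1 if *ps is in the initial state afterwards.
 */
int
reset_to_initial(mbstate_t *ps)
{
	(void)c32rtomb(NULL, 0, ps);
	return mbsinit(ps) != 0;
}
```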

c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments

mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.

uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.6.2.1 19-Aug-2024  martin file t_c16rtomb.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.1 15-Aug-2024  riastradh branches: 1.1.2; 1.1.6;
libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
 1.1.6.2 02-Aug-2025  perseant Sync with HEAD
 1.1.6.1 15-Aug-2024  perseant file t_c32rtomb.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.1.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).
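
A sketch of the null output case as described above, shown for c32rtomb only. reset_state is a hypothetical helper and the code unit passed in is arbitrary, since the message says it is ignored.

```c
#include <assert.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Reset *ps by calling c32rtomb with a null output pointer.
 * The code unit argument (an arbitrary 0x3042 here) is ignored.
 * Returns nonzero if *ps is in the initial conversion state afterward.
 * Hypothetical helper, not part of libc.
 */
int
reset_state(mbstate_t *ps)
{
	size_t len;

	assert(ps != NULL);
	len = c32rtomb(NULL, 0x3042, ps);
	if (len == (size_t)-1)
		return 0;
	return mbsinit(ps);
}
```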
c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments
mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.
uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.1.2.1 15-Aug-2024  martin file t_c32rtomb.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.7 19-Aug-2024  riastradh branches: 1.7.2; 1.7.6;
c32rtomb(3): Use conversion state to handle shift sequences.

For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:

1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state

This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.

Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing

- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character

XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
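
The last two steps of the composition above might be sketched as follows, using only the standard mbrtowc and wcrtomb. This is not the committed code: reencode_with_state is a hypothetical helper, the first step (_citrus_iconv_convert) is assumed to have already produced mb from the initial state, and the XXX about combining characters applies here too.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Re-encode the multibyte character mb[0..mblen), which was encoded
 * from the initial state, relative to the caller's state *ps.
 * out must hold at least MB_LEN_MAX bytes.  Returns the number of
 * bytes stored, or (size_t)-1 on failure.  Hypothetical helper.
 */
size_t
reencode_with_state(char *out, const char *mb, size_t mblen, mbstate_t *ps)
{
	mbstate_t init;
	wchar_t wc;
	size_t n;

	assert(out != NULL && mb != NULL);

	/* mbrtowc with initial conversion state: one wide character. */
	memset(&init, 0, sizeof(init));
	n = mbrtowc(&wc, mb, mblen, &init);
	if (n == (size_t)-1 || n == (size_t)-2)
		return (size_t)-1;

	/* wcrtomb with caller's conversion state: state-dependent bytes. */
	return wcrtomb(out, wc, ps);
}
```

Because wcrtomb sees the caller's state, it only needs to emit a shift sequence when the state actually has to change.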
 1.6 19-Aug-2024  riastradh t_c8rtomb, t_c16rtomb: Simplify comment.

ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
 1.5 18-Aug-2024  riastradh c8rtomb(3): Fix digit error in shift sequence test.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
 1.4 18-Aug-2024  riastradh c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.

PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
 1.3 18-Aug-2024  riastradh c8rtomb(3), c16rtomb(3): Fix NUL handling.

PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong
 1.2 17-Aug-2024  riastradh c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.

PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong
 1.1 15-Aug-2024  riastradh libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L

(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
 1.7.6.2 02-Aug-2025  perseant Sync with HEAD
 1.7.6.1 19-Aug-2024  perseant file t_c8rtomb.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.7.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing
c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.
t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).
c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments
mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.
uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.7.2.1 19-Aug-2024  martin file t_c8rtomb.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.2 15-Jul-2011  jruoho Rename two test files to get functional scope (and avoid confusion
with ctype(3)). No functional change.
 1.1 09-Apr-2011  pgoyette atf-ify the various locale tests
 1.2 15-Jul-2011  jruoho Rename two test files to get functional scope (and avoid confusion
with ctype(3)). No functional change.
 1.1 09-Apr-2011  pgoyette atf-ify the various locale tests
 1.3 24-May-2022  andvar fix various typos in comment, documentation and log messages.
 1.2 01-Jun-2017  perseant Add tests for btowc(3)/wctob(3) and enable compilation of the test for
digittoint(3).

The digittoint(3) test is skipped since we don't provide that function yet.

One of the test cases for btowc(3) is also skipped, since it tests conversion
to Unicode---whereas our wchar_t representation is locale-dependent.
 1.1 30-May-2017  perseant Add test cases for sprintf/sscanf/strto{d,l} and the is* and isw* ctype functions, for single-byte encodings
 1.2 23-Jul-2017  perseant Add missing files from last commit:

Move Unicode <-> ku/ten mapping into the individual codec modules.
Mapping is based on existing iconv data for single-byte encodings,
and included for several, but not all, multibyte encodings.
 1.1 14-Jul-2017  perseant branches: 1.1.2;
file t_ducet.c was initially added on branch perseant-stdc-iso10646.
 1.1.2.2 23-Jul-2017  perseant Add Unicode copyright notice and more verbose DUCET test.
 1.1.2.1 14-Jul-2017  perseant Initial commit of a mostly-working implementation of __STDC_ISO_10646__,
with collation support using the Unicode Collation Algorithm.

The conversion from men/ku/ten form to Unicode is a gross hack at present.
Fixing this, and fleshing out the LC_COLLATE locale component, are next
on the agenda.
 1.5 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.4 21-Jan-2014  yamt branches: 1.4.4; 1.4.20;
fix comment typos pointed out by uebayasi
 1.3 20-Jan-2014  yamt - fix funopen usage
- some more checks
- remove a bogus test case (bad_eucJP_getwc) PR/47660 (Julio Merino)
- add XXX comments
 1.2 17-Mar-2013  jmmv branches: 1.2.4;
Mark two routinely-broken tests as expected failures referencing PR lib/47660.
 1.1 28-Feb-2013  christos regression tests for wide char i/o. Currently there are failures.
 1.2.4.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.2.4.2 23-Jun-2013  tls resync from head
 1.2.4.1 17-Mar-2013  tls file t_io.c was added on branch tls-maxphys on 2013-06-23 06:28:56 +0000
 1.4.20.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.4.4.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.4.4.1 21-Jan-2014  yamt file t_io.c was added on branch yamt-pagecache on 2014-05-22 11:42:20 +0000
 1.3 20-Aug-2024  riastradh branches: 1.3.2; 1.3.6;
mbrtoc32(3): Use conversion state to handle shift sequences.

PR lib/58618: mbrtocN(3) fails to keep shift state
 1.2 19-Aug-2024  riastradh mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.

This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.

PR lib/58618: mbrtocN(3) fails to keep shift state
 1.1 15-Aug-2024  riastradh libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
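
To illustrate what restartable means here, a sketch that feeds a two-byte UTF-8 character to mbrtoc32 one byte at a time. demo_restartable is a hypothetical helper, and the locale names it tries are guesses that may not exist on every system; it also changes the process-wide LC_CTYPE locale.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Feed U+00E9 (UTF-8: 0xc3 0xa9) to mbrtoc32 one byte at a time.
 * Returns 1 if the split input decoded as expected, 0 if no UTF-8
 * locale could be selected, -1 on any other result.
 */
int
demo_restartable(void)
{
	static const char in[] = { (char)0xc3, (char)0xa9 };
	mbstate_t ps;
	char32_t c32 = 0;
	size_t n;

	/* Locale names vary by system; these two are guesses. */
	if (setlocale(LC_CTYPE, "C.UTF-8") == NULL &&
	    setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
		return 0;

	memset(&ps, 0, sizeof(ps));
	assert(mbsinit(&ps));	/* a zeroed state is the initial state */

	/* First byte alone: incomplete, remembered in ps. */
	n = mbrtoc32(&c32, &in[0], 1, &ps);
	if (n != (size_t)-2)
		return -1;

	/* Second byte: picks up where the first call left off. */
	n = mbrtoc32(&c32, &in[1], 1, &ps);
	if (n != 1 || c32 != 0xe9)
		return -1;
	return 1;
}
```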
 1.3.6.2 02-Aug-2025  perseant Sync with HEAD
 1.3.6.1 20-Aug-2024  perseant file t_mbrtoc16.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.3.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.
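
As an illustration of the restartable interface described above, here is a
minimal sketch of a decoding loop built on mbrtoc32. The helper name
decode_to_utf32 and its buffer convention are assumptions made for this
example; they are not part of the commit. The mbstate_t object carries the
conversion state from one call to the next.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: decode the NUL-terminated multibyte string s
 * (interpreted in the current locale) into at most outcap UTF-32 code
 * units.  Returns the number of code units stored, or (size_t)-1 if
 * the input is invalid or truncated.
 */
size_t
decode_to_utf32(const char *s, char32_t *out, size_t outcap)
{
	mbstate_t ps;
	size_t len, n = 0;

	assert(s != NULL && out != NULL);
	len = strlen(s) + 1;		/* include the terminating NUL */
	memset(&ps, 0, sizeof(ps));	/* initial conversion state */

	while (n < outcap) {
		char32_t c32;
		size_t r = mbrtoc32(&c32, s, len, &ps);

		if (r == 0)		/* decoded the terminating null */
			break;
		if (r == (size_t)-1 || r == (size_t)-2)
			return (size_t)-1;
		out[n++] = c32;
		if (r == (size_t)-3)	/* output only, no input consumed */
			continue;
		s += r;			/* r bytes of input were consumed */
		len -= r;
	}
	return n;
}
```

A caller would typically run setlocale(LC_CTYPE, "") first so that the
input is interpreted in the user's locale rather than the default "C" locale.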

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L

(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
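
To make the surrogate case above concrete: each c8 argument is a single
8-bit code unit, so it can never be a surrogate by itself, but a three-byte
UTF-8 sequence can still carry the bits of a code point in
[0xd800,0xdfff]. The following sketch, with the hypothetical helper name
utf8_3byte_is_surrogate, shows how such a sequence could be detected. It is
an illustration only, not the check used in libc.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if the three bytes form a
 * structurally well-formed 3-byte UTF-8 sequence whose payload bits
 * decode to a surrogate code point, which is invalid in UTF-8.
 */
bool
utf8_3byte_is_surrogate(uint8_t b0, uint8_t b1, uint8_t b2)
{
	uint32_t cp;

	/* 1110xxxx 10xxxxxx 10xxxxxx */
	if ((b0 & 0xf0) != 0xe0 || (b1 & 0xc0) != 0x80 ||
	    (b2 & 0xc0) != 0x80)
		return false;

	cp = ((uint32_t)(b0 & 0x0f) << 12) |
	    ((uint32_t)(b1 & 0x3f) << 6) |
	    (uint32_t)(b2 & 0x3f);
	return cp >= 0xd800 && cp <= 0xdfff;
}
```

For example, the bytes 0xed 0xa0 0x80 carry the bits of U+D800 and are
flagged, while 0xe2 0x82 0xac (U+20AC) is not.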

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
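
The language rule behind the build trouble above can be sketched as
follows. Before C23, a label must be followed by a statement, and a static
assertion is a declaration, so it cannot sit directly after a case label.
The commit simply removed the assertion. The sketch below shows one
generic alternative, wrapping the assertion in a compound statement. The
function classify and the asserted property are placeholders invented for
this example.

```c
#include <assert.h>

/*
 * Hypothetical example: a compile-time assertion under a case label.
 * The braces open a compound statement, so the declaration-like
 * static_assert is no longer the first thing after the label.
 */
int
classify(int x)
{
	switch (x) {
	case 0: {
		static_assert(sizeof(int) >= 2, "int is at least 16 bits");
		return 1;
	}
	default:
		return 0;
	}
}
```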

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:

1. a shift sequence from the initial state to some nondefault state,
   like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state

This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.

Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing

- _citrus_iconv_convert
  => gives us a multibyte encoding of the character from the initial
     state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
  => gives us the single wide character representation
  XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
  => gives us a state-dependent multibyte encoding of the character

XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
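
The byte counts quoted above can be reproduced with a small model. This is
only an arithmetic sketch, not the libc code: the constants come from the
commit message (a three-byte shift sequence, one byte per shifted YEN
SIGN), and the function names are invented here. The stateful total
assumes the caller emits one final shift back to the initial state after
the run.

```c
#include <stddef.h>

enum {
	SHIFT_LEN = 3,	/* bytes in one shift sequence, per the text */
	CHAR_LEN = 1	/* bytes per YEN SIGN once shifted, per the text */
};

/* Every character is bracketed by a shift in and a shift back. */
size_t
stateless_len(size_t nchars)
{
	return nchars * (SHIFT_LEN + CHAR_LEN + SHIFT_LEN);
}

/*
 * One shift in, then the characters, then (assumed) one shift back to
 * the initial state at the end of the run.
 */
size_t
stateful_len(size_t nchars)
{
	if (nchars == 0)
		return 0;
	return SHIFT_LEN + nchars * CHAR_LEN + SHIFT_LEN;
}
```

Under these assumptions, two consecutive YEN SIGNs cost 14 bytes without
shared state and 8 bytes with it. A single character costs 7 bytes either
way.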

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.
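
A minimal sketch of the buffer-sizing idea above, assuming a hypothetical
wrapper named encode_one (not the example from the man page). MB_LEN_MAX
is a compile-time constant from <limits.h>, so the scratch buffer is an
ordinary fixed-size array, whereas MB_CUR_MAX depends on the current locale
and would require a variable-length array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: encode one UTF-32 code unit into dst in the
 * current locale, starting from the initial conversion state.
 * Returns the number of bytes written, or (size_t)-1 on an encoding
 * error or if dst is too small.
 */
size_t
encode_one(char32_t c32, char *dst, size_t dstcap)
{
	char buf[MB_LEN_MAX];	/* fixed size, large enough for any locale */
	mbstate_t ps;
	size_t len;

	assert(dst != NULL);
	memset(&ps, 0, sizeof(ps));	/* initial conversion state */
	len = c32rtomb(buf, c32, &ps);
	if (len == (size_t)-1 || len > dstcap)
		return (size_t)-1;
	memcpy(dst, buf, len);
	return len;
}
```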

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).

c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments

mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.

uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.3.2.1 20-Aug-2024  martin file t_mbrtoc16.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.1 15-Aug-2024  riastradh branches: 1.1.2; 1.1.6;
libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
 1.1.6.2 02-Aug-2025  perseant Sync with HEAD
 1.1.6.1 15-Aug-2024  perseant file t_mbrtoc32.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.1.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L

(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:

1. a shift sequence from the initial state to some nondefault state,
   like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state

This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.

Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing

- _citrus_iconv_convert
  => gives us a multibyte encoding of the character from the initial
     state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
  => gives us the single wide character representation
  XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
  => gives us a state-dependent multibyte encoding of the character

XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).

c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments

mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.

uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.1.2.1 15-Aug-2024  martin file t_mbrtoc32.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.3 20-Aug-2024  riastradh branches: 1.3.2; 1.3.6;
mbrtoc32(3): Use conversion state to handle shift sequences.

PR lib/58618: mbrtocN(3) fails to keep shift state
 1.2 19-Aug-2024  riastradh mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.

This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.

PR lib/58618: mbrtocN(3) fails to keep shift state
 1.1 15-Aug-2024  riastradh libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L

(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
 1.3.6.2 02-Aug-2025  perseant Sync with HEAD
 1.3.6.1 20-Aug-2024  perseant file t_mbrtoc8.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.3.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
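
The point about surrogate bits forced into UTF-8 can be illustrated with a small check. This is only a sketch: the helper name `utf8_3byte_is_surrogate' is hypothetical and is not taken from the libc sources. It decodes a three-byte UTF-8 form and reports whether the result lands in [0xd800,0xdfff], which is the invalid case described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of libc: return true if the three
 * bytes have the UTF-8 shape 1110xxxx 10xxxxxx 10xxxxxx and the
 * decoded value is a surrogate code point in [0xd800,0xdfff].
 */
static bool
utf8_3byte_is_surrogate(uint8_t b0, uint8_t b1, uint8_t b2)
{
	uint32_t cp;

	/* Reject anything that is not a well-shaped 3-byte sequence. */
	if ((b0 & 0xf0) != 0xe0 || (b1 & 0xc0) != 0x80 ||
	    (b2 & 0xc0) != 0x80)
		return false;

	/* Reassemble the 4 + 6 + 6 payload bits. */
	cp = ((uint32_t)(b0 & 0x0f) << 12) |
	    ((uint32_t)(b1 & 0x3f) << 6) |
	    (uint32_t)(b2 & 0x3f);

	return cp >= 0xd800 && cp <= 0xdfff;
}
```

Each individual byte is a perfectly ordinary 8-bit value, so no single c8 argument can be a surrogate; only the assembled sequence (0xed followed by a byte in 0xa0..0xbf) is.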

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.

t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
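
The last two steps of the composition described above can be sketched using only standard C functions. This is an illustration under assumptions, not the libc code: the helper name `reencode_with_state' is hypothetical, and the _citrus_iconv_convert step is assumed to have already produced `mb', a multibyte encoding that starts from the initial state. In terms of the byte counts quoted above, the stateless approach costs 3 + 1 + 3 = 7 bytes per YEN SIGN, whereas keeping the caller's state lets two consecutive ones cost 3 + 1 + 1 = 5 bytes before the final 3-byte reset.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of libc: take one character encoded
 * as `mb_len' bytes at `mb', starting from the initial conversion
 * state, and re-encode it into `out' using the caller's state `ps'.
 * `out' must have room for MB_LEN_MAX bytes.  Returns the number of
 * bytes written, or (size_t)-1 on failure.
 */
static size_t
reencode_with_state(char *out, const char *mb, size_t mb_len,
    mbstate_t *ps)
{
	mbstate_t init;
	wchar_t wc;
	size_t n;

	/* mbrtowc with the initial conversion state: one wide character. */
	memset(&init, 0, sizeof(init));
	n = mbrtowc(&wc, mb, mb_len, &init);
	if (n == (size_t)-1 || n == (size_t)-2)
		return (size_t)-1;

	/*
	 * wcrtomb with the caller's state: a shift sequence is emitted
	 * only if `ps' is not already in the state this character needs.
	 */
	return wcrtomb(out, wc, ps);
}
```

In a stateless locale such as "C" the helper simply copies the character through; the saving only appears in stateful encodings such as ISO-2022-JP.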

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.
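
A minimal sketch of the buffer sizing suggested here, assuming a C11 <uchar.h>; the helper name `encode_one' is hypothetical and this is not the text of the man page example. MB_LEN_MAX is a compile-time constant, so the array has fixed size, whereas MB_CUR_MAX depends on the current locale and would make the array variable-length.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode one UTF-32 code unit into `dst' (at
 * most `dstlen' bytes) from the initial conversion state.  Returns
 * the number of bytes stored, or (size_t)-1 on error or lack of room.
 */
static size_t
encode_one(char32_t c32, char *dst, size_t dstlen)
{
	char buf[MB_LEN_MAX];	/* fixed size; buf[MB_CUR_MAX] would be a VLA */
	mbstate_t ps;
	size_t n;

	memset(&ps, 0, sizeof(ps));	/* initial conversion state */
	n = c32rtomb(buf, c32, &ps);
	if (n == (size_t)-1 || n > dstlen)
		return (size_t)-1;
	memcpy(dst, buf, n);
	return n;
}
```

The cost is a few unused bytes on the stack in locales whose MB_CUR_MAX is smaller than MB_LEN_MAX.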

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).

c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments

mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.

uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.3.2.1 20-Aug-2024  martin file t_mbrtoc8.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.2 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.1 15-Jul-2011  jruoho branches: 1.1.34;
Rename two test files to get functional scope (and avoid confusion
with ctype(3)). No functional change.
 1.1.34.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.2 06-May-2014  yamt branches: 1.2.2;
include string.h for memset
 1.1 28-May-2013  joerg branches: 1.1.2; 1.1.6;
Add mbsnrtowcs and wcsnrtombs. Approved by core.
 1.1.6.1 10-Aug-2014  tls Rebase.
 1.1.2.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.1.2.2 23-Jun-2013  tls resync from head
 1.1.2.1 28-May-2013  tls file t_mbsnrtowcs.c was added on branch tls-maxphys on 2013-06-23 06:28:56 +0000
 1.2.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.2.2.1 06-May-2014  yamt file t_mbsnrtowcs.c was added on branch yamt-pagecache on 2014-05-22 11:42:20 +0000
 1.3 21-Dec-2022  wiz adapt mbstowcs_basic test for unicode table update

reformat so it's easier to find which result data belongs to which input
 1.2 12-Jul-2017  perseant branches: 1.2.16;
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.1 15-Jul-2011  jruoho branches: 1.1.34;
Rename two test files to get functional scope (and avoid confusion
with ctype(3)). No functional change.
 1.1.34.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.2.16.1 11-Sep-2023  martin Pull up following revision(s) (requested by wiz in ticket #368):

share/locale/ctype/en_US.UTF-8.src: revision 1.10
share/locale/ctype/en_US.UTF-8.src: revision 1.8
share/locale/ctype/en_US.UTF-8.src: revision 1.9
share/locale/ctype/gen_ctype_utf8.pl: revision 1.1
share/locale/ctype/gen_ctype_utf8.pl: revision 1.2
tests/lib/libc/locale/t_mbstowcs.c: revision 1.3

Update unicode tables.

This version of the file, and the generator script, come from
OpenBSD. The script was written by Andrew Fresh.
The file covers the encodings from Unicode 13.0.0, based on the files
distributed with perl 5.32.1.

Add NetBSD RCS Id header instead of OpenBSD one.

Update Unicode tables.

These tables are for Unicode 14.0.0 using the data provided with
perl 5.36.0.

Update Unicode tables to 15.0.0.
This is based on the tables provided by perl 5.37.7.

adapt mbstowcs_basic test for unicode table update
reformat so it's easier to find which result data belongs to which input
 1.3 30-Jun-2020  jruoho After a comedy of errors, move t_mbtowc to its final resting place.
 1.2 25-May-2017  perseant Add a member to the test data structure that indicates whether the given
encoding is state-dependent, and test the results of wctomb(NULL, '\0') and
mbtowc(NULL, NULL, 0) against this instead of against each other.
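
The check described here might look roughly like the following sketch. The function name `check_state_dependence' and its parameter are hypothetical stand-ins for the test data member mentioned above, not the identifiers used in the test file. It relies on the standard behaviour that wctomb and mbtowc, when given a null string pointer, return nonzero exactly when the current encoding is state-dependent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical check: compare what the two functions report about
 * state dependence against the expected value for the locale, rather
 * than comparing the two results with each other.
 */
static bool
check_state_dependence(bool expect_stateful)
{
	bool wc_stateful = wctomb(NULL, L'\0') != 0;
	bool mb_stateful = mbtowc(NULL, NULL, 0) != 0;

	return wc_stateful == expect_stateful &&
	    mb_stateful == expect_stateful;
}
```

Comparing each result against a known expectation catches the case where both functions are wrong in the same way, which comparing them against each other cannot.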
 1.1 09-Apr-2011  pgoyette atf-ify the various locale tests
 1.8 02-Aug-2021  andvar s/diferent/different/
 1.7 01-Dec-2017  kre Since the C standard allows for intermediate floating results to contain
more precision bits than the data type expects, but (kind of obviously)
does not allow such values to be stored in memory, expecting the value
returned from strtod() (an intermediate result) to be identical (that is,
equal) to a stored value is incorrect.

So instead go back to checking that the two numbers are very very close.
See comments added to the test for more explanation.
 1.6 28-Nov-2017  kre Revert 1.4 (perhaps temporarily) and add even more diagnostics to those
added in 1.3 to see if it is possible to determine why the strict equality
test fails on i386, yet succeeds elsewhere.
 1.5 24-Nov-2017  kre When comparing doubles (any floating point values) which have been
computed using different methods, don't expect to achieve identical
results (here, one constant is perhaps converted to binary from a string by
a cross compiler, the other is converted at run time). Allow them to
have a small difference (for now, small is < 1e-7 - the constant is ~ 1e5,
so this is 12 orders of magnitude less) before failing (and include the
actual difference in the error message if it does fail.)
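
A tolerance comparison of the kind described can be sketched as follows. The helper name `nearly_equal' is hypothetical and the code is not copied from the test; the 1e-7 bound is the one quoted in the message above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true if `a' and `b' differ by less than `tol'.
 * The absolute value is computed by hand to avoid depending on libm.
 * If either input is NaN, the comparison fails and false is returned.
 */
static bool
nearly_equal(double a, double b, double tol)
{
	double diff = a - b;

	if (diff < 0)
		diff = -diff;
	return diff < tol;
}
```

With values around 1e5 and a tolerance of 1e-7, this accepts differences in the last few bits of the mantissa while still rejecting any genuinely wrong conversion.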
 1.4 23-Nov-2017  kre Add some diagnostics to the strto test, so I can see why this
fails on i386 (on qemu) - will probably keep them when done.
 1.3 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.2 07-Jun-2017  perseant Change t_sprintf to an expected failure, since we don't respect the empty
thousands separator of the C/POSIX locale (PR standards/52282).
 1.1 30-May-2017  perseant branches: 1.1.2;
Add test cases for sprintf/sscanf/strto{d,l} and the is* and isw* ctype functions, for single-byte encodings
 1.1.2.3 15-Mar-2018  bouyer Pull up following revision(s) (requested by martin in ticket #631):
tests/lib/libc/locale/t_sprintf.c: revision 1.4
tests/lib/libc/locale/t_sprintf.c: revision 1.5
tests/lib/libc/locale/t_sprintf.c: revision 1.6
tests/lib/libc/locale/t_sprintf.c: revision 1.7
Add some diagnostics to the strto test, so I can see why this
fails on i386 (on qemu) - will probably keep them when done.
When comparing doubles (any floating point values) which have been
computed using different methods, don't expect to achieve identical
results (here, one constant is perhaps converted to binary from a string by
a cross compiler, the other is converted at run time). Allow them to
have a small difference (for now, small is < 1e-7 - the constant is ~ 1e5,
so this is 12 orders of magnitude less) before failing (and include the
actual difference in the error message if it does fail.)
Revert 1.4 (perhaps temporarily) and add even more diagnostics to those
added in 1.3 to see if it is possible to determine why the strict equality
test fails on i386, yet succeeds elsewhere.
Since the C standard allows for intermediate floating results to contain
more precision bits than the data type expects, but (kind of obviously)
does not allow such values to be stored in memory, expecting the value
returned from strtod() (an intermediate result) to be identical (that is,
equal) to a stored value is incorrect.
So instead go back to checking that the two numbers are very very close.
See comments added to the test for more explanation.
 1.1.2.2 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.1.2.1 14-Mar-2018  bouyer Pull up following revision(s) (requested by martin in ticket #630):
lib/libc/stdio/vfwprintf.c: revision 1.35
lib/libc/stdio/vfwprintf.c: revision 1.36
tests/lib/libc/locale/t_sprintf.c: revision 1.2
Change t_sprintf to an expected failure, since we don't respect the empty
thousands separator of the C/POSIX locale (PR standards/52282).
Do not use thousands grouping when none is specified by the locale.
Fixes PR standards/52282.
A more correct fix for PR standards/52282.
 1.6 27-Nov-2023  christos Don't use fmtcheck for strfmon format strings. It does not work. Fix a broken
test.
 1.5 14-Oct-2023  christos PR/57633: Jose Luis Duran: Add strfmon tests from FreeBSD
 1.4 28-Sep-2023  christos Add testing for pad resetting (Jose Luis Duran)
 1.3 02-Aug-2021  andvar s/diferent/different/
 1.2 07-Dec-2017  kre Update this test to expect the output that is supposed to be produced
by strfmon() rather than the output the old buggy implementation used
to produce.
 1.1 16-Aug-2017  joerg branches: 1.1.2;
Add missing strfmon_l. Noticed by Bruno Haible. Add test case.
 1.1.2.2 29-Aug-2017  martin Pull up following revision(s) (requested by joerg in ticket #215):
tests/lib/libc/locale/t_strfmon.c: revision 1.1
tests/lib/libc/locale/Makefile: revision 1.12
lib/libc/stdlib/strfmon.c: revision 1.11
distrib/sets/lists/debug/mi: revision 1.224
include/monetary.h: revision 1.3
distrib/sets/lists/tests/mi: revision 1.761
lib/libc/stdlib/strfmon.3: revision 1.6
lib/libc/stdlib/strfmon.3: revision 1.7
Add missing strfmon_l. Noticed by Bruno Haible. Add test case.
Typo fix.
 1.1.2.1 16-Aug-2017  martin file t_strfmon.c was added on branch netbsd-8 on 2017-08-29 11:51:50 +0000
 1.2 02-Aug-2021  andvar s/diferent/different/
 1.1 30-May-2017  perseant branches: 1.1.4;
Add simple test case for toupper/tolower
 1.1.4.1 23-Jan-2018  perseant Make the tests pass once more when __STDC_ISO_10646__ is not defined.
 1.3 14-Oct-2024  rillig branches: 1.3.2; 1.3.6;
tests/t_uchar: fix copy-and-paste typo
 1.2 13-Oct-2024  riastradh tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.

PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h

PR lib/52374: <uchar.h> missing
 1.1 15-Aug-2024  riastradh uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.

PR lib/52374: <uchar.h> missing
 1.3.6.2 02-Aug-2025  perseant Sync with HEAD
 1.3.6.1 14-Oct-2024  perseant file t_uchar.c was added on branch perseant-exfatfs on 2025-08-02 05:58:05 +0000
 1.3.2.2 14-Oct-2024  martin Pull up following revision(s) (requested by riastradh in ticket #976):

lib/libc/locale/c32rtomb.3: revision 1.10
lib/libc/locale/c32rtomb.3: revision 1.9
lib/libc/locale/c32rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc32.c: revision 1.1
distrib/sets/lists/base/shl.mi: revision 1.988
lib/libc/include/namespace.h: revision 1.204
lib/libc/include/namespace.h: revision 1.205
lib/libc/locale/mbrtoc16.3: revision 1.1
lib/libc/locale/mbrtoc16.c: revision 1.1
lib/libc/locale/mbrtoc16.3: revision 1.2
lib/libc/locale/mbrtoc16.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.3
lib/libc/locale/mbrtoc16.c: revision 1.3
lib/libc/locale/mbrtoc32.3: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.1
tests/lib/libc/locale/t_c16rtomb.c: revision 1.1
lib/libc/locale/mbrtoc32.c: revision 1.2
lib/libc/locale/mbrtoc16.3: revision 1.4
lib/libc/locale/mbrtoc16.c: revision 1.4
lib/libc/locale/mbrtoc32.3: revision 1.2
tests/lib/libc/locale/t_c16rtomb.c: revision 1.2
lib/libc/locale/mbrtoc32.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.5
lib/libc/locale/mbrtoc16.c: revision 1.5
lib/libc/locale/mbrtoc32.3: revision 1.3
tests/lib/libc/locale/t_c16rtomb.c: revision 1.3
lib/libc/locale/mbrtoc32.c: revision 1.4
lib/libc/locale/mbrtoc16.3: revision 1.6
lib/libc/locale/mbrtoc16.c: revision 1.6
lib/libc/locale/mbrtoc32.3: revision 1.4
tests/lib/libc/locale/t_c16rtomb.c: revision 1.4
lib/libc/locale/mbrtoc32.c: revision 1.5
lib/libc/locale/mbrtoc16.3: revision 1.7
lib/libc/locale/mbrtoc16.c: revision 1.7
lib/libc/locale/mbrtoc32.3: revision 1.5
tests/lib/libc/locale/t_c16rtomb.c: revision 1.5
lib/libc/locale/mbrtoc32.c: revision 1.6
lib/libc/locale/mbrtoc16.3: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.6
tests/lib/libc/locale/t_c16rtomb.c: revision 1.6
lib/libc/locale/mbrtoc32.c: revision 1.7
lib/libc/locale/mbrtoc16.3: revision 1.9
lib/libc/locale/mbrtoc32.3: revision 1.7
lib/libc/locale/mbrtoc32.c: revision 1.8
lib/libc/locale/mbrtoc32.3: revision 1.8
lib/libc/locale/mbrtoc32.c: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2468
lib/libc/locale/mbrtoc32.3: revision 1.9
distrib/sets/lists/comp/mi: revision 1.2469
lib/libc/locale/c32rtomb.h: revision 1.1
lib/libc/locale/c32rtomb.h: revision 1.2
include/Makefile: revision 1.147
share/man/man3/uchar.3: revision 1.1
share/man/man3/uchar.3: revision 1.2
tests/lib/libc/locale/t_c32rtomb.c: revision 1.1
distrib/sets/lists/comp/mi: revision 1.2470
lib/libc/locale/c16rtomb.3: revision 1.1
lib/libc/locale/c16rtomb.c: revision 1.1
lib/libc/locale/c16rtomb.3: revision 1.2
lib/libc/locale/c16rtomb.c: revision 1.2
lib/libc/locale/c16rtomb.3: revision 1.3
lib/libc/locale/c16rtomb.c: revision 1.3
lib/libc/locale/c16rtomb.3: revision 1.4
lib/libc/locale/c16rtomb.c: revision 1.4
lib/libc/locale/c16rtomb.3: revision 1.5
lib/libc/locale/c16rtomb.c: revision 1.5
lib/libc/locale/c16rtomb.3: revision 1.6
lib/libc/locale/c16rtomb.c: revision 1.6
lib/libc/locale/c16rtomb.3: revision 1.7
lib/libc/locale/c16rtomb.c: revision 1.7
lib/libc/locale/c16rtomb.3: revision 1.8
lib/libc/locale/c16rtomb.3: revision 1.9
distrib/sets/lists/tests/mi: revision 1.1330
distrib/sets/lists/tests/mi: revision 1.1331
distrib/sets/lists/tests/mi: revision 1.1332
tests/lib/libc/locale/t_uchar.c: revision 1.1
tests/lib/libc/locale/t_uchar.c: revision 1.2
tests/lib/libc/locale/t_uchar.c: revision 1.3
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc16.c: revision 1.3
include/uchar.h: revision 1.1
include/uchar.h: revision 1.2
include/uchar.h: revision 1.3
include/uchar.h: revision 1.4
include/uchar.h: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.1
include/uchar.h: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.2
tests/lib/libc/locale/t_c8rtomb.c: revision 1.3
tests/lib/libc/locale/t_c8rtomb.c: revision 1.4
share/man/man3/Makefile: revision 1.93
tests/lib/libc/locale/t_c8rtomb.c: revision 1.5
tests/lib/libc/locale/t_c8rtomb.c: revision 1.6
tests/lib/libc/locale/t_c8rtomb.c: revision 1.7
lib/libc/shlib_version: revision 1.297
lib/libc/locale/c16rtomb.3: revision 1.10
lib/libc/locale/c16rtomb.3: revision 1.11
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.1
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.2
tests/lib/libc/locale/t_mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc16.3: revision 1.10
tests/lib/libc/locale/Makefile: revision 1.15
tests/lib/libc/locale/Makefile: revision 1.16
tests/lib/libc/locale/Makefile: revision 1.17
tests/lib/libc/locale/Makefile: revision 1.18
distrib/sets/lists/debug/mi: revision 1.442
distrib/sets/lists/debug/mi: revision 1.443
distrib/sets/lists/debug/mi: revision 1.444
lib/libc/locale/c8rtomb.3: revision 1.1
lib/libc/locale/c8rtomb.c: revision 1.1
lib/libc/locale/c8rtomb.3: revision 1.2
lib/libc/locale/c8rtomb.c: revision 1.2
lib/libc/locale/c8rtomb.3: revision 1.3
lib/libc/locale/c8rtomb.c: revision 1.3
lib/libc/locale/c8rtomb.3: revision 1.4
lib/libc/locale/c8rtomb.c: revision 1.4
lib/libc/locale/c8rtomb.3: revision 1.5
lib/libc/locale/c8rtomb.c: revision 1.5
lib/libc/locale/c8rtomb.3: revision 1.6
lib/libc/locale/c8rtomb.c: revision 1.6
lib/libc/locale/c8rtomb.3: revision 1.7
lib/libc/locale/c8rtomb.3: revision 1.8
lib/libc/locale/c8rtomb.3: revision 1.9
lib/libc/locale/mbrtoc32.h: revision 1.1
lib/libc/locale/mbrtoc32.h: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.1
lib/libc/locale/mbrtoc8.3: revision 1.1
lib/libc/locale/mbrtoc8.c: revision 1.2
lib/libc/locale/mbrtoc8.3: revision 1.2
lib/libc/locale/mbrtoc8.c: revision 1.3
lib/libc/locale/mbrtoc8.3: revision 1.3
lib/libc/locale/mbrtoc8.c: revision 1.4
lib/libc/locale/mbrtoc8.3: revision 1.4
lib/libc/locale/Makefile.inc: revision 1.66
lib/libc/locale/mbrtoc8.c: revision 1.5
lib/libc/locale/mbrtoc8.3: revision 1.5
lib/libc/locale/Makefile.inc: revision 1.67
lib/libc/locale/mbrtoc8.c: revision 1.6
lib/libc/locale/mbrtoc8.3: revision 1.6
lib/libc/locale/mbrtoc8.c: revision 1.7
lib/libc/locale/mbrtoc8.3: revision 1.7
lib/libc/locale/mbrtoc8.c: revision 1.8
lib/libc/locale/c32rtomb.3: revision 1.1
lib/libc/locale/c32rtomb.c: revision 1.1
lib/libc/locale/c32rtomb.3: revision 1.2
lib/libc/locale/c32rtomb.c: revision 1.2
lib/libc/locale/c32rtomb.3: revision 1.3
lib/libc/locale/c32rtomb.c: revision 1.3
lib/libc/locale/c32rtomb.3: revision 1.4
lib/libc/locale/c32rtomb.c: revision 1.4
lib/libc/locale/c32rtomb.3: revision 1.5
lib/libc/locale/c32rtomb.c: revision 1.5
lib/libc/locale/c32rtomb.3: revision 1.6
lib/libc/locale/c32rtomb.c: revision 1.6
lib/libc/locale/c32rtomb.3: revision 1.7
lib/libc/locale/c32rtomb.3: revision 1.8

(all via patch)


tests/lib/libc/locale/Makefile: Sort.
No functional change intended.
Preparation for PR lib/52374.

uchar.h: New header file for C11 (and C++11) compliance.

Implementation of the new functions mbrtoc16, c16rtomb, mbrtoc32, and
c32rtomb to come later. Updates for C23 to come later.
PR lib/52374: <uchar.h> missing

libc: New C11 functions mbrtoc16, mbrtoc32, c16rtomb, c32rtomb.

The mbrtoc16/32 functions read multibyte strings according to the
current locale into UTF-16/32 code unit sequences; the c16/32rtomb
functions write UTF-16/32 code unit sequences into multibyte strings
according to the current locale. The `r' means restartable: they
work incrementally and pick up where they left off.

NOTE: This bumps the libc minor version, since it adds new symbols.

PR lib/52374: <uchar.h> missing
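
A minimal sketch of what "restartable" means here, assuming only the
standard C11 interface of mbrtoc32 (the helper name decode_bytewise is
hypothetical and not part of libc). It feeds the input one byte at a
time and relies on the mbstate_t object to carry partial characters
between calls:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode a multibyte string in the current locale
 * into UTF-32 code units, passing mbrtoc32 a single byte per call.
 * Returns the number of code units stored in out, or (size_t)-1 on an
 * invalid sequence.
 */
size_t
decode_bytewise(const char *s, size_t len, char32_t *out, size_t outlen)
{
    mbstate_t ps;
    size_t i = 0, n = 0;

    memset(&ps, 0, sizeof(ps));     /* initial conversion state */
    while (i < len) {
        char32_t c32;
        size_t r = mbrtoc32(&c32, &s[i], 1, &ps);

        if (r == (size_t)-1)
            return (size_t)-1;      /* invalid multibyte sequence */
        if (r == (size_t)-2) {
            i++;                    /* byte absorbed into ps; no output yet */
            continue;
        }
        if (r != (size_t)-3)
            i++;                    /* r is 0 or 1: one byte was consumed */
        if (n == outlen)
            break;                  /* output buffer is full */
        out[n++] = c32;
    }
    return n;
}
```

In a UTF-8 locale the same loop would see (size_t)-2 for every byte of
a multibyte character except the last one.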
mbrtoc16(3), mbrtoc32(3): Fix \n in man page examples.
Need to write \en to pacify roff.
PR lib/52374: <uchar.h> missing

c16rtomb(3), c32rtomb(3): Fix more \n in man pages.
Also, tighten an assertion: we left room for a NUL byte at the end.
PR lib/52374: <uchar.h> missing

libc: Use the more idiomatic alignof from stdalign.h.
No functional change intended.
PR lib/52374: <uchar.h> missing

mbrtoc16(3): Simplify surrogate state test.

Turn the finer-grained test into an assertion.
No semantic change intended: we are supposed to control this state,
and we always arrange it this way. (But in principle this could
change the behaviour of buggy programs that violate the mbstate_t
abstraction.)
PR lib/52374: <uchar.h> missing

libc: New functions c8rtomb(3) and mbrtoc8(3).

New in C23, for converting from UTF-8 to locale-dependent multibyte
sequences (c8rtomb) or vice versa (mbrtoc8), along with the new type
char8_t.

Conditional on either:
- _NETBSD_SOURCE
- _ISOC23_SOURCE
- __STDC_VERSION__ >= 202311L
(Riding the libc minor bump from this morning for the UTF-16/UTF-32
versions from C11.)

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
libc: c32rtomb and mbrtoc32 are used internally, so weak-alias them.
PR lib/52374: <uchar.h> missing
c8rtomb(3), mbrtoc8(3): Use namespace.h to get private aliases.

This way applications defining the symbols c32rtomb or mbrtoc32 won't
clobber our private definitions, which are slightly more constrained
about their use of mbstate_t than is obvious from the interface
contract.

PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
mbrtoc16(3), mbrtoc32(3): brush up markup

Split long .Fn lines into Fo/Fa/Fc. Don't indent the list of return
values. Don't use artisanal -width.

Untabify code examples - indented literal displays don't have correct
tab stops consistent with tab stops in the fixed font code, so the
lines end up misaligned in the PostScript output.

c16rtomb(3), c32rtomb(3): brush up markup

mbrtoc16(3), mbrtoc32(3): Simplify return value language.
Also expand BMP only once.
PR lib/52374: <uchar.h> missing

mbrtoc16(3), mbrtoc32(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc32(3): Clarify control flow.
No need for another goto here; let's keep it clearly structured with
a single `out' label.
No functional change intended.
PR lib/52374: <uchar.h> missing

c8rtomb(3), mbrtoc8(3): brush up markup

mbrtoc8(3): Simplify return value language.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): Specify what happens if ps is null.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Specify what happens when ps is null.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c16rtomb(3), c32rtomb(3): No state overlap with mbrtoc8 or c8rtomb.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Work on deturgidifying prose.
Still maybe not great but at least there's less jargon in most of the
text, without really losing any content.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Work on deturgidifying prose.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3), mbrtoc32(3): Restore word accidentally removed.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Restore word accidentally removed.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

c8rtomb(3): Fix possible error descriptions.
The argument c8 can't be a surrogate code point itself (they're in
the range [0xd800,0xdfff], beyond 8-bit values), but the bits of a
surrogate code point could be forced into the UTF-8 format, which is
also invalid.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
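
To illustrate the point about surrogate bits forced into the UTF-8
format, here is a small sketch (the function name is hypothetical, and
this is not the libc code). It applies the usual three-byte UTF-8 bit
layout and checks whether the result lands in [0xd800,0xdfff]:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical check: given the three bytes of a well-formed-looking
 * UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx), return 1 if the bits
 * decode to a surrogate code point, which UTF-8 forbids, else 0.
 */
int
utf8_3byte_is_surrogate(unsigned char b0, unsigned char b1, unsigned char b2)
{
    uint32_t cp = ((uint32_t)(b0 & 0x0f) << 12) |
        ((uint32_t)(b1 & 0x3f) << 6) |
        (uint32_t)(b2 & 0x3f);

    return cp >= 0xd800 && cp <= 0xdfff;
}
```

For example, the bytes ED A0 80 carry the bits of 0xd800, so a
conforming converter has to reject them even though no single char8_t
value is itself a surrogate.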

c16rtomb(3), c32rtomb(3): Attempt a deturgidification pass.
Limit the jargon around surrogates.
PR lib/52374: <uchar.h> missing

c8rtomb(3): Clarify prose and fix example in caveat.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
c16rtomb(3), c32rtomb(3), mbrtoc16(3), mbrtoc32(3): xref c8 versions.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc16(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR lib/52374: <uchar.h> missing

mbrtoc8(3): Clarify how many bytes are consumed in special cases.
Fix overlap in RETURN VALUES section.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

pass lint, XXX see lint bug.

libc: Add _l variants of the cNrtomb and mbrtocN functions.
These accept an explicit locale parameter, rather than using the
current locale.
Visible under _NETBSD_SOURCE, not exposed otherwise.
NOTE: This adds libc symbols. Riding the libc minor bump for the
non-_l variants of these from two days ago -- hope that's not pushing
it too far.
PR lib/58613: c*rtomb, mbrtoc* should have locale-parametric _l
variants

c8rtomb(3), c16rtomb(3): Add tests for incomplete NUL termination.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3): Fix NUL handling.
PR lib/58615: incomplete c8rtomb, c16rtomb handles NUL termination
wrong

c8rtomb(3), c16rtomb(3), c32rtomb(3): Test stateful shift sequences.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Fix digit error in shift sequence test.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3): Nix __CTASSERT after case label.
I put this in to make it (machine-verifiably) clear that zeroing the
state is the same as returning to the initial conversion state, as
the standard requires, but this is causing build trouble (and will
likely cause more trouble if pulled up) because some definitions of
__CTASSERT make a declaration which is forbidden after a label, so
let's remove it.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb
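
For context, a sketch of why a declaration right after a case label is
a problem, and of one alternative to removing the assertion: wrapping
the case body in braces. This is not what the commit did. The struct,
enum and function names are hypothetical, and C11 _Static_assert
stands in for __CTASSERT:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical conversion state; the real layout is not shown in the log. */
struct example_state {
    unsigned pending;       /* hypothetical: bits of a partial character */
};

enum { EXAMPLE_INITIAL_PENDING = 0 };

/* Returns 1 if the state was reset, 0 for any other opcode. */
int
example_reset(int op, struct example_state *st)
{
    switch (op) {
    case 0: {
        /*
         * The braces start a compound statement, so the declaration
         * below follows `{' rather than the label itself.
         */
        _Static_assert(EXAMPLE_INITIAL_PENDING == 0,
            "zeroing must yield the initial state");
        memset(st, 0, sizeof(*st));
        return 1;
    }
    default:
        return 0;
    }
}
```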

mbrtoc8(3): Fix pasto in comment at top.
No functional change intended.
PR standards/58601: uchar.h C23 compliance: char8_t, mbrtoc8, c8rtomb

mbrtoc8: remove lint-specific workarounds
No binary change.

mbrtoc8: fix comments

mbrtoc16, mbrtoc32: fix comments, remove lint-specific workarounds
No binary change.
t_c8rtomb, t_c16rtomb: Simplify comment.
ESC $ B is technically rather the JIS X 0208-1983 shift sequence, but
since I don't see any way to provoke the JIS X 0208-1978 shift
sequence to come flying out of this conversion (ESC $ @), and I'm not
sure there's any difference in the interpretation, let's just say JIS
X 0208.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c32rtomb(3): Use conversion state to handle shift sequences.
For conversion of Unicode scalar values to coding systems requiring
shift sequences, such as ISO-2022-JP, _citrus_iconv_convert will
always produce:
1. a shift sequence from the initial state to some nondefault state,
like from US-ASCII to JIS X 0208
2. the encoding of the desired character
3. a shift sequence restoring the initial state
This is unnecessary if the output is already in the state needed to
encode the desired character. For example, this method produces
seven bytes to encode each YEN SIGN in ISO-2022-JP -- and fourteen,
to encode two consecutive ones -- even though the shift sequence is
only three bytes long and once shifted YEN SIGN takes only one byte.
Instead, convert the Unicode scalar value to a locale-dependent wide
character and encode that, by composing
- _citrus_iconv_convert
=> gives us a multibyte encoding of the character from the initial
state (and restoring the initial state afterward)
- mbrtowc with initial conversion state
=> gives us the single wide character representation
XXX If combining characters are possible here, this may fail.
- wcrtomb with caller's conversion state
=> gives us a state-dependent multibyte encoding of the character
XXX Is there a cheaper way to convert from Unicode scalar value to
locale-dependent wide character? It is not obvious to me from the
largely undocumented Citrus machinery, but it would obviously be
better than this somewhat circuitous Rube Goldberg contraption of
chained multibyte APIs.
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences
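
The byte counts quoted above can be checked with a little arithmetic.
The sketch below uses the figures from the message (a three-byte shift
sequence and a one-byte character once shifted). The function names
are hypothetical, and the stateful total assumes the output shifts
back to the initial state once at the end:

```c
#include <assert.h>
#include <stddef.h>

enum {
    SHIFT_LEN = 3,      /* bytes per shift sequence, per the message */
    CHAR_LEN = 1        /* bytes per character once shifted */
};

/* Old behaviour: shift in, emit, shift out, for every character. */
size_t
bytes_shift_every_char(size_t nchars)
{
    return nchars * (SHIFT_LEN + CHAR_LEN + SHIFT_LEN);
}

/* Stateful behaviour: shift in once, emit all, shift out once. */
size_t
bytes_shift_once(size_t nchars)
{
    if (nchars == 0)
        return 0;
    return SHIFT_LEN + nchars * CHAR_LEN + SHIFT_LEN;
}
```

With these assumptions, two consecutive characters cost fourteen bytes
the old way and eight when the conversion state is honoured.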

mbrtoc8(3), mbrtoc16(3): Test consuming shift sequences with state.
This has the side effect of testing mbrtoc32(3) because they are both
defined in terms of it.
PR lib/58618: mbrtocN(3) fails to keep shift state

c8rtomb(3), c16rtomb(3), c32rtomb(3): Suggest MB_LEN_MAX in example.
This way it avoids variable-length arrays, by always allocating the
maximum space that could be occupied by MB_CUR_MAX.
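
A sketch of the buffer sizing this describes, assuming only standard
C11 (the helper name encode_one is hypothetical and this is not the
man page example itself). MB_LEN_MAX is a compile-time constant, so
the array has a fixed size regardless of the locale in effect:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode one UTF-32 code unit into out, which
 * must have room for MB_LEN_MAX bytes. Returns what c32rtomb returns:
 * the number of bytes stored, or (size_t)-1 on error.
 */
size_t
encode_one(char32_t c32, char out[MB_LEN_MAX])
{
    mbstate_t ps;

    memset(&ps, 0, sizeof(ps));     /* initial conversion state */
    return c32rtomb(out, c32, &ps);
}
```

A caller would declare `char buf[MB_LEN_MAX];` rather than
`char buf[MB_CUR_MAX];`, since the latter is a variable-length array
whose size depends on the current locale.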

mbrtoc32(3): Use conversion state to handle shift sequences.
PR lib/58618: mbrtocN(3) fails to keep shift state

mbrtoc32(3): Fix name and type of mbrtowc_l return value.
This was from `int mbtowc_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to mbrtowc_l. Caught by
lint.
`mb_len' avoids (harmless) clash with standard C function mblen(3).
PR lib/58618: mbrtocN(3) fails to keep shift state

c32rtomb(3): Fix type of wcrtomb_l return value.
This was from `int wctomb_l(...)' in an earlier draft and I didn't
update it to size_t when I changed the draft to wcrtomb_l. Caught by
lint.
`wc_len' mirrors `mb_len' in the complementary code in mbrtoc32(3) to
avoid clash with standard C function mblen(3).
PR lib/58612: c8rtomb/c16rtomb/c32rtomb yield suboptimal shift
sequences

c8rtomb(3), c16rtomb(3), c32rtomb(3): Attempt to simplify language.

c8rtomb(3), c16rtomb(3), c32rtomb(3): Fix null string output case.
This ignores c8/c16/c32, produces no output anywhere, and just resets
ps to the initial conversion state.
Also just use 0 in the example, not '\0' or L'\0'. This works for
C11, which prefers '\0' and L'\0', and for C23, which introduced the
new u8'\0', u'\0' (UTF-16), and U'\0' (UTF-32).
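
A sketch of the null-output case, assuming the standard C11 behaviour
of c32rtomb with a null first argument (the helper name is
hypothetical). The code unit passed in is ignored and the state object
ends up in the initial conversion state:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: reset *ps by calling c32rtomb with a null
 * output pointer. Returns 1 if *ps is in the initial conversion state
 * afterwards, 0 otherwise.
 */
int
reset_and_check(mbstate_t *ps)
{
    /* With s == NULL the second argument is ignored. */
    size_t r = c32rtomb(NULL, 0x3042, ps);

    if (r == (size_t)-1)
        return 0;
    return mbsinit(ps) != 0;
}
```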
c16rtomb, c32rtomb, mbrtoc8: fix page numbers in comments
mbrtoc8(3), mbrtoc16(3), mbrtoc32(3): Say 0 for zero code unit.
Rather than deal with differences between C11 and C23 in notation,
'\0' vs L'\0' vs u8'\0' vs u'\0' vs U'\0'.
uchar.h: Include <sys/featuretest.h> before testing _*_SOURCE.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

uchar.h: Need <sys/cdefs.h> for __restrict.
PR lib/52374: <uchar.h> missing

uchar.h: Simplify __cpp_char8_t and __cplusplus conditionals.
No functional change intended.
PR lib/52374: <uchar.h> missing

tests/lib/libc/locale/t_uchar: Test for char8_t, mbrtoc8, c8rtomb.
PR lib/58752: various header files test _*_SOURCE macros but don't
include sys/featuretest.h
PR lib/52374: <uchar.h> missing

tests/t_uchar: fix copy-and-paste typo
 1.3.2.1 14-Oct-2024  martin file t_uchar.c was added on branch netbsd-10 on 2024-10-14 17:20:19 +0000
 1.1 14-Jul-2017  perseant branches: 1.1.2;
Add a simple collation test. This test is expected to fail on HEAD since
we do not yet have a working implementation of wcscoll.
 1.1.2.1 14-Jul-2017  perseant Initial commit of a mostly-working implementation of __STDC_ISO_10646__,
with collation support using the Unicode Collation Algorithm.

The conversion from men/ku/ten form to Unicode is a gross hack at present.
Fixing this, and fleshing out the LC_COLLATE locale component, are next
on the agenda.
 1.1 21-Nov-2011  joerg branches: 1.1.4;
Add test cases for strcspn, strpbrk, strspn, wcscspn, wcspbrk and
wcsspn.
 1.1.4.2 17-Apr-2012  yamt sync with head
 1.1.4.1 21-Nov-2011  yamt file t_wcscspn.c was added on branch yamt-pagecache on 2012-04-17 00:09:11 +0000
 1.1 21-Nov-2011  joerg branches: 1.1.4;
Add test cases for strcspn, strpbrk, strspn, wcscspn, wcspbrk and
wcsspn.
 1.1.4.2 17-Apr-2012  yamt sync with head
 1.1.4.1 21-Nov-2011  yamt file t_wcspbrk.c was added on branch yamt-pagecache on 2012-04-17 00:09:11 +0000
 1.1 28-Jul-2019  christos branches: 1.1.6;
PR/54414: Valery Ushakov: add a test for wcsrtombs(3) doesn't update the
source argument on conversion error
 1.1.6.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.6.1 28-Jul-2019  martin file t_wcsrtombs.c was added on branch phil-wifi on 2020-04-13 08:05:26 +0000
 1.1 21-Nov-2011  joerg branches: 1.1.4;
Add test cases for strcspn, strpbrk, strspn, wcscspn, wcspbrk and
wcsspn.
 1.1.4.2 17-Apr-2012  yamt sync with head
 1.1.4.1 21-Nov-2011  yamt file t_wcsspn.c was added on branch yamt-pagecache on 2012-04-17 00:09:11 +0000
 1.5 14-Jul-2017  joerg VAX doesn't have the test cases, so stub the body as well.
 1.4 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.3 01-Oct-2011  christos branches: 1.3.34;
Undo previous, Checking for vax is more appropriate.
 1.2 01-Oct-2011  christos no more ifdef vax
 1.1 09-Apr-2011  pgoyette atf-ify the various locale tests
 1.3.34.2 18-Mar-2018  martin Additionally pull up r1.5 for ticket #608:

VAX doesn't have the test cases, so stub the body as well.
 1.3.34.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.5 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.4 25-May-2017  perseant branches: 1.4.2;
Add a member to the test data structure that indicates whether the given
encoding is state-dependent, and test the results of wctomb(NULL, '\0') and
mbtowc(NULL, NULL, 0) against this instead of against each other.
 1.3 25-Mar-2013  gson Don't size an array using MB_CUR_MAX while one locale is in effect and
then use it with another locale having a larger MB_CUR_MAX. This
should fix the t_wctomb:wcrtomb_state test failures seen on i386.
 1.2 11-Jun-2011  christos branches: 1.2.2; 1.2.8;
Turn warns on for all tests and fix all the bugs.
 1.1 09-Apr-2011  pgoyette branches: 1.1.2;
atf-ify the various locale tests
 1.1.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.2.8.1 23-Jun-2013  tls resync from head
 1.2.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.4.2.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
 1.3 24-May-2022  andvar fix various typos in comment, documentation and log messages.
 1.2 12-Jul-2017  perseant Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.
 1.1 30-May-2017  perseant branches: 1.1.2;
Add test cases for sprintf/sscanf/strto{d,l} and the is* and isw* ctype functions, for single-byte encodings
 1.1.2.1 15-Mar-2018  martin Pull up following revision(s) (requested by maya in ticket #608):
tests/lib/libc/locale/t_sprintf.c: revision 1.3
tests/lib/libc/locale/t_wctomb.c: revision 1.5
tests/lib/libc/locale/t_io.c: revision 1.5
tests/lib/libc/locale/t_wcstod.c: revision 1.4
tests/lib/libc/locale/t_mbstowcs.c: revision 1.2
tests/lib/libc/locale/t_wctype.c: revision 1.2
tests/lib/libc/locale/t_mbrtowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.2
tests/lib/libc/locale/t_btowc.c: revision 1.3
Add ISO10646 versions of these tests, conditional on __STDC_ISO_10646__ .
Also make the tests a bit more verbose, to aid debugging when they fail.

Separate the C/POSIX locale test from the rest; make it more thorough
and more correct. This fixes a problem reported by martin@ when the
test is compiled with -funsigned-char.
