History log of /src/lib/libc/gen/ctype_.c
Revision  Date  Author  Comments
 1.24  15-Sep-2025  riastradh ctype(3): New environment variable LIBC_ALLOWCTYPEABUSE.

If set, this does not force the ctype(3) functions to crash when
passed invalid inputs -- instead, they will return nonsense results,
and possibly print warnings to stderr, as is their right in
implementing undefined behaviour.

The nature of the nonsense results is unspecified. Currently, is*()
will always return true (even if that leads to mutually contradictory
conclusions, like isalpha and isdigit, or isgraph and isblank), and
tolower/toupper() will always return EOF. But perhaps in the future
the results may be randomized.

This way, if an application like firefox crashes on ctype abuse, you
can opt to accept the consequences of nonsense results instead by
running `env LIBC_ALLOWCTYPEABUSE= firefox' until the application is
fixed.

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
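
For context, the "ctype abuse" this entry refers to is typically a plain char with a negative value passed straight to an is*() or to*() function. The C standard only defines these functions for EOF and for values representable as unsigned char. Below is a minimal sketch of the conventional fix on the application side. The helper name sketch_count_alpha is hypothetical and is not part of libc or of this commit.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the NetBSD tree): count the alphabetic
 * bytes in a string without triggering ctype(3) undefined behaviour.
 */
static size_t
sketch_count_alpha(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		/*
		 * Where char is signed, *s may be negative for bytes
		 * >= 0x80.  Converting to unsigned char first keeps the
		 * argument inside the range isalpha() is defined for.
		 */
		if (isalpha((unsigned char)*s))
			n++;
	}
	return n;
}
```

Writing isalpha(*s) without the cast is the pattern that the guard pages are meant to catch, and that LIBC_ALLOWCTYPEABUSE tolerates.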
 1.23  30-Mar-2025  riastradh branches: 1.23.2;
ctype(3): Simplify definitions of ctype/tolower/toupper tables.

Clarify comment while here.

No functional change intended. No change to `readelf -a' output on
amd64 or aarch64.

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
 1.22  29-Mar-2025  riastradh libc: Restore ELF symbol sizes for _C_ctype_tab_ &c.

This is needed for dynamic position-dependent executables that refer
directly to _C_ctype_tab_ to get correct copy relocations to see the
table content.

Unfortunately, such executables won't get a guard page.

Fortunately, referring to _C_ctype_tab_ directly (and not the
indirection _ctype_tab_ as the ctype(3) macros do) is very weird and
unlikely to happen in the real world (none of the public interfaces
use it; it is exported for libc++.so/libstdc++.so to use, but those
aren't PIEs). So missing the guard page in this case is probably not
so bad.

The symbol sizes are also needed for, e.g., gdb to nicely identify
addresses that lie in the table.

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
 1.21  29-Mar-2025  riastradh ctype(3): Put guard pages before the C ctype/tolower/toupper tables.

This also only affects machines where char is signed for now. (But
maybe it would be worth doing unconditionally; users could still try
to pass in explicit `signed char' inputs.)

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
 1.20  13-Apr-2013  joerg branches: 1.20.40;
Extend ctype classification table to 16 bits. Based on a patch by
Takehiko Nozaki, with changes to make compilation fail when the old
names are used and to exploit __BUILD_LEGACY
 1.19  14-Dec-2010  joerg branches: 1.19.6; 1.19.12;
Prefix ctype bitmask macros with _CTYPE
 1.18  01-Jun-2010  tnozaki more split ctype.h -> sys/ctype_inline.h, sys/ctype_bits.h
 1.17  22-May-2010  tnozaki 1. Hide the _CTYPE_PRIVATE section in ctype.h; move it to the private header ctype_local.h.
2. Do not use the _CTYPE_NUM_CHARS macro to read data from the LC_CTYPE (old BSDCTYPE style) database.
Because 1<<CHAR_BIT is MD, add the MI macro _CTYPE_CACHE_SIZE (1<<8).
3. Remove the _NB_CACHED_RUNE macro; use _CTYPE_CACHE_SIZE instead.
 1.16  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22280, verified by myself.
 1.15  17-Apr-2001  kleink Need <limits.h> in _CTYPE_PRIVATE environments.
 1.14  13-Jul-1997  christos branches: 1.14.14;
Fix RCSID's
 1.13  02-Jun-1997  kleink Add support for localized character sets (a.k.a. LC_CTYPE).

Thanks go to Matthias Scheler <tron@lyssa.owl.de> for contributing his initial
work in PR/3592, and to Christos Zoulas for refining it!
 1.12  25-Feb-1995  cgd merge with Lite, keep local changes. clean up id usage
 1.11  17-May-1994  cgd copyright foo
 1.10  05-Oct-1993  jtc Due to an 8-bit attribute table and 9 bits of attributes, I've had to
remove the _B attribute from the "horizontal tab" position, and change
the isblank function to explicitly test against space and tab.

When I finish merging the 4.4 runes code, this table will have to grow
to 16 bit entries, as several more attributes have been introduced.
I'm making this change so existing libraries will continue working
for the next (little) while.
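
The explicit test described in this entry can be sketched as follows. The function name sketch_isblank is hypothetical; the actual libc function and its surrounding macros are not reproduced here.

```c
/*
 * Hypothetical stand-in for the isblank() change described above:
 * with no spare attribute bit for "blank" in an 8-bit table entry,
 * the two blank characters of the C locale are compared directly.
 */
static int
sketch_isblank(int c)
{
	return c == ' ' || c == '\t';
}
```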
 1.9  14-Sep-1993  jtc Both space and tab are in the blank character class.
 1.8  26-Aug-1993  jtc Declare rcsid strings so they are stored in text segment.
 1.7  23-Aug-1993  jtc Moved toupper and tolower tables from ctype_.c to their own files --- I
received complaints about using shorts in the table (but i need a range
of -1..255), so now the tables will not be used unless either toupper()
or tolower() (and soon, setlocale()) are used. This can save up to 514
bytes.

In toupper_.c and tolower_.c make sure that our assumption of EOF == -1
holds.

Fixed bug where _toupper_tab_ was initialized pointing to _C_tolower_tab.
 1.6  21-Aug-1993  jtc _ctype_, _tolower_tab_, and _toupper_tab_ are now pointers to the tables.
The tables have been renamed to _C_ctype_, _C_tolower_, and _C_toupper_
as they are tables for the C locale. When switching to a new locale, the
pointers will be set to point to tables specific to the new locale.
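
The layout described in revisions 1.6 and 1.7 can be sketched as a table of shorts covering -1..255, reached through a pointer that a locale switch could redirect. All names beginning with sketch_ are hypothetical, the table is filled at run time only for brevity, and ASCII is assumed for the case conversion.

```c
#include <stdio.h>	/* EOF */

/* As in revision 1.7, make sure the EOF == -1 assumption holds. */
_Static_assert(EOF == -1, "table layout assumes EOF == -1");

/* 257 entries: slot 0 is for EOF, slots 1..256 are for 0..255. */
static short sketch_C_toupper[257];

/* The pointer, not the table, is what the lookup goes through. */
static const short *sketch_toupper_tab = sketch_C_toupper;

/* Fill the C locale table; assumes an ASCII execution character set. */
static void
sketch_init(void)
{
	int c;

	sketch_C_toupper[0] = EOF;
	for (c = 0; c < 256; c++) {
		if (c >= 'a' && c <= 'z')
			sketch_C_toupper[c + 1] = (short)(c - 'a' + 'A');
		else
			sketch_C_toupper[c + 1] = (short)c;
	}
}

/* Valid only for c == EOF or 0 <= c <= 255. */
static int
sketch_toupper(int c)
{
	return (sketch_toupper_tab + 1)[c];
}
```

Switching locale would then amount to pointing sketch_toupper_tab at a different 257-entry table.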
 1.5  09-Aug-1993  jtc Oops! I used EOF but didn't include <stdio.h>.
 1.4  06-Aug-1993  jtc Added C locale specific translation tables for toupper and tolower. When
locales are fully supported, toupper and tolower will refer to this, or
a locale specific table, through pointers.
 1.3  06-Aug-1993  jtc Use const qualifier with _ctype_ table. Smart compilers can then store it
in the text segment. When we implement locales, the isctype macros/functions
will reference this table (or a locale specific table) through a pointer, but
for right now, it continues to reference the _ctype_ table directly.
 1.2  30-Jul-1993  mycroft Add even more RCS frobs.
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.14.14.1  08-Oct-2001  nathanw Catch up to -current.
 1.19.12.1  23-Jun-2013  tls resync from head
 1.19.6.1  22-May-2014  yamt sync with head.

for reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.20.40.1  02-Aug-2025  perseant Sync with HEAD
 1.23.2.1  01-Oct-2025  martin Pull up following revision(s) (requested by riastradh in ticket #45):

tests/lib/libc/gen/t_ctype.c: revision 1.12
lib/libc/gen/ctype_.c: revision 1.24
lib/libc/locale/rune.c: revision 1.50
tests/lib/libc/gen/Makefile: revision 1.61
lib/libc/gen/tolower_.c: revision 1.18
lib/libc/gen/isctype.c: revision 1.29
distrib/sets/lists/tests/mi: revision 1.1394
lib/libc/gen/toupper_.c: revision 1.18
lib/libc/gen/ctype_guard.h: revision 1.8
lib/libc/locale/Makefile.inc: revision 1.69
lib/libc/gen/ctype.3: revision 1.32
lib/libc/gen/ctype.3: revision 1.33
distrib/sets/lists/debug/mi: revision 1.486
tests/lib/libc/gen/h_ctype_abuse.c: revision 1.1
tests/lib/libc/gen/h_ctype_abuse.c: revision 1.2

ctype(3): New environment variable LIBC_ALLOWCTYPEABUSE.

If set, this does not force the ctype(3) functions to crash when
passed invalid inputs -- instead, they will return nonsense results,
and possibly print warnings to stderr, as is their right in
implementing undefined behaviour.

The nature of the nonsense results is unspecified. Currently, is*()
will always return true (even if that leads to mutually contradictory
conclusions, like isalpha and isdigit, or isgraph and isblank), and
tolower/toupper() will always return EOF. But perhaps in the future
the results may be randomized.

This way, if an application like firefox crashes on ctype abuse, you
can opt to accept the consequences of nonsense results instead by
running `env LIBC_ALLOWCTYPEABUSE= firefox' until the application is
fixed.

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
ctype(3): Document LIBC_ALLOWCTYPEABUSE.

If this is pulled up to netbsd-11, we should tweak the text to make
it apply to 11 too.
PR lib/58208: ctype(3) provides poor runtime feedback of abuse

ctype(3): Fix build of tests on machines with unsigned char.
Could maybe phrase this better but this'll do for now.

PR lib/58208: ctype(3) provides poor runtime feedback of abuse
