History log of /src/bin/sh/miscbltin.c
Revision  Date  Author  Comments
 1.57  03-Jul-2025  kre Don't allow read to make use of the shell's internal '='
terminates var names feature (which exists so in things
like "export foo=bar" the shell can simply set the "variable"
"foo=bar" to "bar" and doesn't need to put \0 on top of the '=',
or copy the var name part elsewhere, and other similar internal
advantages) - in most cases either allowing the '=' is intended, (as
in the export example) or other checks make it impossible (${var} etc),
but nothing was checking the var names passed to the read command.

Fix that ... (side effect is that now if an invalid name is
given, it will be detected before anything is read, before a
prompt is written, rather than after the read, when the vars
are being set to the fields from the line read).

Don't bother doing this in SMALL shells, avoid the (small) extra
code bloat - SMALL shells can just treat being able to say
read a b=hello c (which means the same as read a b c)
as a harmless foible...
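
The check described above can be sketched roughly as follows. This is a minimal illustration only, assuming a plain [A-Za-z_][A-Za-z0-9_]* rule for names; valid_var_name() and first_bad_name() are hypothetical helpers, not the functions used in miscbltin.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the function used in miscbltin.c):
 * return true only if name is a plain shell variable name, i.e.
 * [A-Za-z_][A-Za-z0-9_]* with nothing after it, so "b=hello" fails.
 */
static bool
valid_var_name(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	assert(name != NULL);
	if (!(isalpha(*p) || *p == '_'))
		return false;
	for (p++; *p != '\0'; p++) {
		if (!(isalnum(*p) || *p == '_'))
			return false;
	}
	return true;
}

/*
 * Check every name up front, before any prompt is written or any
 * input is read. Returns the index of the first bad name, or -1.
 */
static int
first_bad_name(const char *const names[], int count)
{
	for (int i = 0; i < count; i++) {
		if (!valid_var_name(names[i]))
			return i;
	}
	return -1;
}
```

Doing the check in one pass before the read means a bad name such as b=hello is reported without consuming any input.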
 1.56  12-Oct-2024  kre Undo an idiotic attempted micro optimisation in the previous
version which broke things...
 1.55  11-Oct-2024  kre Add -b and -nMAX options to the read builtin.

As requested on (perhaps more than one) mailing list, this adds
a -n MAX option, to allow the amount of data read by the read
builtin to be limited to MAX bytes (in case the record delimiter
doesn't appear in the input for a long time). There is
currently an upper bound of 8MiB on the value of MAX.

Also add a -b option, which allows for buffered input (with
some usage caveats) rather than 1 byte at a time.

Neither option exists in SMALL shells.

Note that the proposed -z option got deleted ... I couldn't
find a rational way to explain what the final state would be
if a \0 on input generated an error, so rather than have
things ambiguous, better just not to have the option, and
simply keep ignoring input \0's as always.

See the (updated) sh(1) man page for more details.

No pullups planned (new feature, only for new releases).
 1.54  05-Oct-2023  kre If the read builtin is told to read into IFS, we must avoid doing
that until all current uses of IFS are complete (as we have IFS's
value cached in ifs - if IFS alters, ifs might point anywhere).
Handle this by deferring assignments to IFS until everything is done.
This makes us appear to comply with the (currently) proposed requirement
for read by POSIX that field splitting complete before vars are
assigned. (Other shells, like dash, ksh93, yash, bosh behave like this)

That might end up being unspecified though, as other shells (bosh,
mksh) assign each field to its var as it is delimited (though bosh
appears to have bugs). If we wanted to go that route, the issue here
could have been handled by re-doing the init of ifs after every
setvar() that is performed here (except the last, after which it is
no longer needed).

XXX pullup -10
 1.53  11-Dec-2022  kre branches: 1.53.2;

It appears that POSIX intends to add a -d X option to the read command
in its next version, so it can be used as -d '' (to specify a \0 end
character for the record read, rather than the default \n) to accompany
find -print0 and xargs -0 options (also likely to be added).

Add support for -d now. While here fix a bug where escaped nul
chars (\ \0) in non-raw mode were not being dropped, as they are
when not escaped (if not dropped, they're still not used in any
useful way, they just ended the value at that point).
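
As an illustration of the record handling described above, here is a small sketch that works on an in-memory buffer rather than a file descriptor. copy_record() is a hypothetical name and the buffer-based interface is an assumption made for the example; the real builtin reads its input differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not the code in miscbltin.c: copy one record
 * from in[0..inlen) into out, stopping at the delimiter byte delim
 * (which is '\0' for the -d '' case). When delim is not '\0', input
 * NUL bytes are dropped instead of ending the value early.
 * Returns the number of input bytes consumed, delimiter included.
 */
static size_t
copy_record(const char *in, size_t inlen, char delim, char *out, size_t outsz)
{
	size_t i, n = 0;

	assert(in != NULL && out != NULL && outsz > 0);
	for (i = 0; i < inlen; i++) {
		char c = in[i];

		if (c == delim) {	/* end of record */
			i++;
			break;
		}
		if (c == '\0')		/* a NUL cannot be stored in a variable */
			continue;
		if (n + 1 < outsz)	/* leave room for the terminator */
			out[n++] = c;
	}
	out[n] = '\0';
	return i;
}
```

With delim set to '\n' an embedded NUL is skipped; with delim set to '\0' the same byte ends the record, which is the behaviour wanted for input from find -print0.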
 1.52  19-Aug-2022  kre Don't output the error for bad usage (no var name given)
after already writing the prompt (set with the -p option).

That results in nonsense like:

$ read -p foo
fooread: arg count

While here, improve the error message so it means something.

Now we will get:

$ read -p foo
read: variable name required
Usage: read [-r] [-p prompt] var...

[Detected by code reading while doing the work for the previous fix]
 1.51  19-Aug-2022  kre PR bin/56972 Fix escape ('\') handling in sh read builtin.

In 1.35 (March 2005) (the big read fixup), most escape handling and IFS
processing in the read builtin was corrected. However 2 cases were missed,
one is a word (something to be assigned to any variable but the last) in
which every character is escaped (the code was relying on a non-escaped char
to set the "in a word" status), and second trailing IFS whitespace at
the end of the line was being deleted, even if the chars had been escaped
(the escape chars are no longer present).

See the PR for more details (including the case that detected the problem).

After fixing this, I looked at the FreeBSD code (normally might do it
before, but these fixes were trivial) to check their implementation.
Their code does similar things to what ours now does, but in a completely
different way; their read builtin is more complex than ours needs to
be (they handle more options). For anyone tempted to simply incorporate
their code, note that it relies upon infrastructure changes elsewhere
in the shell, so would not be a simple cut and drop in exercise.

This needs pullups to -3 -4 -5 -6 -7 -8 and -9 (fortunately this is
happening before -10 is branched, so will never be broken this way there).
 1.50  16-Apr-2022  kre Redo the way the builtin cmd 'ulimit' getopt() (nextopt() really, but it
is essentially the same) arg string is generated, to lessen the chances
that the table of limits, and the arg string that allows limits to be
reported or set will get out of sync. They weren't (as long as we didn't
grow an RLIMIT_SWAP); this is just tidier.

While here, reorder the limits table fields, and shrink a couple that
were needlessly wasteful, to save some space -- for most architectures
this should save 8 bytes per table entry (there are currently 13).
(Some minor code bloat offsets this slightly because of int type
promotions now required).

NFCI.
 1.49  16-Apr-2022  kre While doing the previous change, I noticed that when used in a
particularly perverse way, the error message for a bad octal
constant as the new umask value could incorrectly claim that the
-S option (which would need to be present to cause this issue)
was the detected bad value. Fix that to report the actual
incorrect arg.

And while fiddling, also check for args to umask that are too big
to be sane mask values (the biggest permitted is 07777) and use
mode_t as the mask variable type, rather than int.
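
The range check described above might look something like the following sketch. parse_octal_mask() is a hypothetical helper written for illustration; only the 07777 bound and the use of mode_t come from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical sketch, not the code in miscbltin.c: parse an octal
 * umask argument into *mask. Rejects empty strings, non-octal
 * characters, and any value above 07777.
 */
static bool
parse_octal_mask(const char *arg, mode_t *mask)
{
	unsigned long val = 0;

	assert(arg != NULL && mask != NULL);
	if (*arg == '\0')
		return false;
	for (; *arg != '\0'; arg++) {
		if (*arg < '0' || *arg > '7')
			return false;		/* not an octal digit */
		val = (val << 3) | (unsigned long)(*arg - '0');
		if (val > 07777)
			return false;		/* too big to be a sane mask */
	}
	*mask = (mode_t)val;
	return true;
}
```

Checking the bound inside the loop keeps val small, so the accumulator cannot overflow no matter how long the argument is.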
 1.48  16-Apr-2022  kre Avoid generating error messages implying that user errors are illegal.
 1.47  12-Dec-2021  andvar s/Miscelaneous/Miscellaneous/ and s/slahes/slashes/ in comments.
 1.46  16-Nov-2021  kre Detect write errors to stdout, and exit(1) from some built-in
commands which (primarily) are used just to generate output
(or with a particular option combination do so).
 1.45  15-Sep-2021  kre Have the ulimit command watch for ulimit -n (alter number of available fds)
and keep the rest of the shell aware of any changes.

While here, modify 'ulimit -aSH' to print both the soft and hard limits
for the resources, rather than just (in this case, as H comes last) the
hard limit. In any other case when both S and H are present, and we're
examining a limit, use the soft limit (just as if neither were given).

No change for setting limits (both are set, unless exactly one of -H
or -S is given). However, we now check for overflow when converting
the value to be assigned, rather than just truncating the value however
it happens to work out...
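
One way to do an overflow check of this kind is sketched below. parse_limit(), its factor parameter (the unit multiplier, such as 512 or 1024) and its max parameter are assumptions made for the example, not the interface used by the ulimit builtin.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not the code in miscbltin.c: convert a decimal
 * limit argument, scale it by factor, and fail instead of silently
 * truncating when the result would not fit in [0, max].
 */
static bool
parse_limit(const char *arg, uintmax_t factor, uintmax_t max, uintmax_t *out)
{
	char *end;
	uintmax_t val;

	assert(arg != NULL && out != NULL && factor > 0);
	if (*arg < '0' || *arg > '9')	/* strtoumax() would accept a sign or blanks */
		return false;
	errno = 0;
	val = strtoumax(arg, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return false;		/* out of range, or trailing junk */
	if (val > max / factor)
		return false;		/* val * factor would exceed max */
	*out = val * factor;
	return true;
}
```

Dividing max by factor before comparing avoids performing the multiplication that could itself wrap around.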
 1.44  13-May-2017  gson branches: 1.44.2; 1.44.10; 1.44.12;
Fix inconsistent whitespace
 1.43  09-May-2015  christos branches: 1.43.6;
CID 1225078: check getrlimit return
 1.42  11-Jun-2012  njoly Allow thread limit queries by adding the new -r flag to ulimit. Add
the corresponding documentation in the man page.
 1.41  09-Jun-2012  christos support RLIMIT_NTHR.
 1.40  11-Oct-2011  christos branches: 1.40.2;
print the flag too next to the units like bash does.
 1.39  18-Jun-2011  christos PR/45069: Henning Petersen: Use prototypes from builtins.h .
 1.38  29-Mar-2009  mrg branches: 1.38.4;
- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
 1.37  28-Dec-2008  christos branches: 1.37.2;
rlim_t will be unsigned as TOG mandates.
 1.36  01-Oct-2005  christos branches: 1.36.26;
fix setmode error handling.
 1.35  19-Mar-2005  dsl Fix the way the 'read' builtin processes IFS. In particular:
- IFS whitespace is now processed correctly,
- Trailing non-whitespace IFS characters are added to the last variable
iff a subsequent variable would have been assigned a non-null string.
Now passes the 'read' tests in http://www.research.att.com/~gsf/public/ifs.sh
 1.34  19-Apr-2004  lukem branches: 1.34.2;
Correct the description of sbsize; it is parsed in bytes not kbytes.
 1.33  17-Apr-2004  christos understand rlimit sbsize
 1.32  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22249, verified by myself.
 1.31  24-Nov-2002  christos Fixes from David Laight:
- ansification
- format of output of jobs command (etc)
- job identifiers %+, %- etc
- $? and $(...)
- correct quoting of output of set, export -p and readonly -p
- differentiation between normal and 'posix special' builtins
- correct behaviour (posix) for errors on builtins and special builtins
- builtin printf and kill
- set -o debug (if compiled with DEBUG)
- cd src obj (as ksh - too useful to do without)
- unset -e name, remove non-readonly variable from export list.
(so I could unset -e PS1 before running the test shell...)
 1.30  04-Feb-2001  christos remove redundant declarations and nested externs.
 1.29  04-Jan-2001  lukem use more standard %ll_ in favour of %q_
 1.28  22-Nov-2000  christos error message cleanup:
- don't print the builtin name twice
- explain why things fail
- no extra newline
 1.27  26-Sep-1998  christos include <stdlib.h> to get the prototype for free()
 1.26  24-Sep-1998  itohy The return value of setmode(3) is a pointer to malloc()'ed area
and must be freed to avoid memory leaks if called repeatedly.
The leaks occurred with a symbolic umask command, such as "umask go-w",
which is undocumented.
 1.25  20-May-1998  christos Cast is*() args to unsigned chars in case the ctype macros are implemented
using arrays.
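
The reason for the cast can be shown with a short sketch. skip_blanks() is a hypothetical function used only for illustration: a plain char may be negative, and a negative value (other than EOF) passed to an is*() macro that indexes a table would read outside that table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical example: count the leading whitespace bytes of s.
 * The cast to unsigned char keeps bytes with the high bit set from
 * being passed to isspace() as negative values.
 */
static size_t
skip_blanks(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	while (isspace((unsigned char)s[n]))
		n++;
	return n;
}
```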
 1.24  04-Feb-1998  thorpej Fix printf formats so they work on the Alpha.
 1.23  21-Jan-1998  christos BSD4_4 is a standard symbol in <sys/param.h>; make sure that files
that need this defined, include <sys/param.h> and don't define it in
the Makefile. Add a comment to that effect.
 1.22  16-Jan-1998  christos test for the boundary condition in the previous trailing blank fix
 1.21  15-Jan-1998  christos PR/4805: Ty Sarna: read builtin does not remove trailing blanks.
 1.20  05-Nov-1997  kleink Per 1003.2, the (builtin) read utility shall treat the backslash as an
escape character (including line continuation), unless the `-r' option
is specified:
* adapt to this behaviour, add the `-r' option to disable it;
* remove the `-e' option, which was previously necessary to get this behaviour.
 1.19  04-Jul-1997  christos branches: 1.19.2;
Fix compiler warnings.
 1.18  11-Apr-1997  christos Make this work on systems that don't have quads
 1.17  11-Jan-1997  tls kill 'register'
 1.16  16-Oct-1996  christos PR/2808: Remove trailing whitespace (from FreeBSD)
 1.15  12-Jun-1995  jtc branches: 1.15.6;
Changed type of rlimit values from quad_t to rlim_t. Cast rlim_t's to
quad_t's and use "%qd" in printf.
Eliminated unnecessary conditional.
 1.14  11-May-1995  christos Merge in my changes from vangogh, and fix the x=`false`; echo $? == 0
bug.
 1.13  21-Mar-1995  cgd convert to new RCS id conventions.
 1.12  05-Dec-1994  cgd clean up further. more patches from Jim Jegers
 1.11  11-Jun-1994  mycroft Add RCS ids.
 1.10  12-May-1994  jtc use prototypes provided by header files instead of our own
 1.9  12-May-1994  jtc Include appropriate header files to bring function prototypes into scope.
 1.8  11-May-1994  jtc integrate NetBSD's POSIX.2 compliant umask builtin
 1.7  11-May-1994  jtc sync with 4.4lite
 1.6  06-Apr-1994  cgd do the right thing if 'read' given no args. pointed out by Geoff Rehmet
 1.5  01-Aug-1993  mycroft Add RCS identifiers.
 1.4  21-Jul-1993  jtc Make umask builtin of shell POSIX 1003.2 compliant:
Print out a symbolic mask with the -S option; and accept symbolic mask
specifications.
 1.3  23-Mar-1993  cgd changed "Id" to "Header" for rcsids
 1.2  22-Mar-1993  cgd added rcs ids to all files
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.2  11-May-1994  jtc 44lite code
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.15.6.1  26-Jan-1997  rat Update /bin/sh from trunk per request of Christos Zoulas. Fixes
many bugs.
 1.19.2.2  08-May-1998  mycroft Sync with trunk, per request of christos.
 1.19.2.1  06-Nov-1997  mellon Pull rev 1.20 up from trunk (kleink)
 1.34.2.1  07-Apr-2005  tron Pull up revision 1.35 (requested by dsl in ticket #117):
Fix the way the 'read' builtin processes IFS. In particular:
- IFS whitespace is now processed correctly,
- Trailing non-whitespace IFS characters are added to the last variable
iff a subsequent variable would have been assigned a non-null string.
Now passes the 'read' tests in http://www.research.att.com/~gsf/public/ifs.sh
 1.36.26.1  01-Apr-2009  snj Pull up following revision(s) (requested by mrg in ticket #622):
bin/csh/csh.1: revision 1.46
bin/csh/func.c: revision 1.37
bin/ps/print.c: revision 1.111
bin/ps/ps.c: revision 1.74
bin/sh/miscbltin.c: revision 1.38
bin/sh/sh.1: revision 1.92 via patch
external/bsd/top/dist/machine/m_netbsd.c: revision 1.7
lib/libkvm/kvm_proc.c: revision 1.82
sys/arch/mips/mips/cpu_exec.c: revision 1.55
sys/compat/darwin/darwin_exec.c: revision 1.57
sys/compat/ibcs2/ibcs2_exec.c: revision 1.73
sys/compat/irix/irix_resource.c: revision 1.15
sys/compat/linux/arch/amd64/linux_exec_machdep.c: revision 1.16
sys/compat/linux/arch/i386/linux_exec_machdep.c: revision 1.12
sys/compat/linux/common/linux_limit.h: revision 1.5
sys/compat/osf1/osf1_resource.c: revision 1.14
sys/compat/svr4/svr4_resource.c: revision 1.18
sys/compat/svr4_32/svr4_32_resource.c: revision 1.17
sys/kern/exec_subr.c: revision 1.62
sys/kern/init_sysctl.c: revision 1.160
sys/kern/kern_exec.c: revision 1.288
sys/kern/kern_resource.c: revision 1.151
sys/sys/param.h: patch
sys/sys/resource.h: revision 1.31
sys/sys/sysctl.h: revision 1.184
sys/uvm/uvm_extern.h: revision 1.153
sys/uvm/uvm_glue.c: revision 1.136
sys/uvm/uvm_mmap.c: revision 1.128
usr.bin/systat/ps.c: revision 1.32
- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.
- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.
- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)
- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)
- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.
- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)
this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.
tested on i386 and sparc64, build tested on several other platforms.
thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
 1.37.2.1  13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.38.4.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.40.2.1  30-Oct-2012  yamt sync with head
 1.43.6.1  19-May-2017  pgoyette Resolve conflicts from previous merge (all resulting from $NetBSD
keyword expansion)
 1.44.12.1  27-Oct-2022  martin Pull up following revision(s) (requested by kre in ticket #1549):

bin/sh/miscbltin.c: revision 1.51
bin/sh/miscbltin.c: revision 1.52

PR bin/56972 Fix escape ('\') handling in sh read builtin.

In 1.35 (March 2005) (the big read fixup), most escape handling and IFS
processing in the read builtin was corrected. However 2 cases were missed,
one is a word (something to be assigned to any variable but the last) in
which every character is escaped (the code was relying on a non-escaped char
to set the "in a word" status), and second trailing IFS whitespace at
the end of the line was being deleted, even if the chars had been escaped
(the escape chars are no longer present).

See the PR for more details (including the case that detected the problem).

After fixing this, I looked at the FreeBSD code (normally might do it
before, but these fixes were trivial) to check their implementation.

Their code does similar things to what ours now does, but in a completely
different way; their read builtin is more complex than ours needs to
be (they handle more options). For anyone tempted to simply incorporate
their code, note that it relies upon infrastructure changes elsewhere
in the shell, so would not be a simple cut and drop in exercise.

This needs pullups to -3 -4 -5 -6 -7 -8 and -9 (fortunately this is
happening before -10 is branched, so will never be broken this way there).

-

Don't output the error for bad usage (no var name given)
after already writing the prompt (set with the -p option).

That results in nonsense like:
$ read -p foo
fooread: arg count

While here, improve the error message so it means something.

Now we will get:
$ read -p foo
read: variable name required
Usage: read [-r] [-p prompt] var...

[Detected by code reading while doing the work for the previous fix]
 1.44.10.2  21-Apr-2020  martin Ooops, restore accidentally removed files from merge mishap
 1.44.10.1  21-Apr-2020  martin Sync with HEAD
 1.44.2.1  27-Oct-2022  martin Pull up following revision(s) (requested by kre in ticket #1779):

bin/sh/miscbltin.c: revision 1.51
bin/sh/miscbltin.c: revision 1.52

PR bin/56972 Fix escape ('\') handling in sh read builtin.

In 1.35 (March 2005) (the big read fixup), most escape handling and IFS
processing in the read builtin was corrected. However 2 cases were missed,
one is a word (something to be assigned to any variable but the last) in
which every character is escaped (the code was relying on a non-escaped char
to set the "in a word" status), and second trailing IFS whitespace at
the end of the line was being deleted, even if the chars had been escaped
(the escape chars are no longer present).

See the PR for more details (including the case that detected the problem).

After fixing this, I looked at the FreeBSD code (normally might do it
before, but these fixes were trivial) to check their implementation.

Their code does similar things to what ours now does, but in a completely
different way; their read builtin is more complex than ours needs to
be (they handle more options). For anyone tempted to simply incorporate
their code, note that it relies upon infrastructure changes elsewhere
in the shell, so would not be a simple cut and drop in exercise.

This needs pullups to -3 -4 -5 -6 -7 -8 and -9 (fortunately this is
happening before -10 is branched, so will never be broken this way there).

-

Don't output the error for bad usage (no var name given)
after already writing the prompt (set with the -p option).

That results in nonsense like:
$ read -p foo
fooread: arg count

While here, improve the error message so it means something.

Now we will get:
$ read -p foo
read: variable name required
Usage: read [-r] [-p prompt] var...

[Detected by code reading while doing the work for the previous fix]
 1.53.2.1  03-Nov-2023  martin Pull up following revision(s) (requested by kre in ticket #454):

bin/sh/miscbltin.c: revision 1.54

If the read builtin is told to read into IFS, we must avoid doing
that until all current uses of IFS are complete (as we have IFS's
value cached in ifs - if IFS alters, ifs might point anywhere).
Handle this by deferring assignments to IFS until everything is done.

This makes us appear to comply with the (currently) proposed requirement
for read by POSIX that field splitting complete before vars are
assigned. (Other shells, like dash, ksh93, yash, bosh behave like this)

That might end up being unspecified though, as other shells (bosh,
mksh) assign each field to its var as it is delimited (though bosh
appears to have bugs). If we wanted to go that route, the issue here
could have been handled by re-doing the init of ifs after every
setvar() that is performed here (except the last, after which it is
no longer needed).
