History log of /src/usr.bin/printf/printf.c |
Revision | | Date | Author | Comments |
1.59 |
| 24-Nov-2024 |
kre | Improve detection and diagnosis of invalid values for conversions. (In particular, integer conversions contain no spaces, and must always contain at least 1 digit, '' is not valid).
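As an illustration of the rule described in this entry (no spaces, at least one digit, empty string rejected), here is a minimal sketch. The name `parse_integer_strict` is hypothetical and this is not the code in printf.c; it only shows one way such a check could be layered on top of `strtoimax()`, which by itself would skip leading whitespace.

```c
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>

/*
 * Hypothetical helper (not the function used in printf.c): parse an
 * integer operand strictly. Returns 0 on success, -1 if the text is
 * not a valid integer.
 */
int
parse_integer_strict(const char *s, intmax_t *out)
{
	const char *p = s;
	char *end;

	/* Optional sign, then at least one digit: rejects "" and " 12". */
	if (*p == '+' || *p == '-')
		p++;
	if (!isdigit((unsigned char)*p))
		return -1;

	errno = 0;
	*out = strtoimax(s, &end, 0);	/* base 0: decimal, octal or hex */
	if (*end != '\0' || errno == ERANGE)
		return -1;		/* trailing junk, or out of range */
	return 0;
}
```

With this sketch, `""`, `" 12"`, `"+"` and `"12 "` are all rejected, while `"-12"` and `"0x1f"` are accepted.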
|
1.58 |
| 07-Aug-2024 |
kre | Correctly handle extracting wide chars from empty strings.
Fix a (probably would have rarely been seen) bug I installed yesterday.
It turns out that mbtowc() needs the terminating \0 to be included in the length arg passed to it, or it errors (EILSEQ) on a zero length, instead of doing the sane thing and treating that the same as "\0" (which has length 1). So, increase the length passed to mbtowc() by 1. That makes no difference in the typical case: the length is only an upper limit on the number of bytes to examine, and mbtowc() stops after it has converted 1 character, so for non-"" inputs nothing that matters changes.
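The length adjustment described above can be sketched as follows. The helper name `first_wide_char` is an assumption made for illustration; only `mbtowc()` and `strlen()` are real library calls, and the actual code in printf.c may be organised differently.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: fetch the first wide character of s.
 * Returns what mbtowc() returns: 0 for an empty string, the number of
 * bytes consumed for a valid character, -1 for an invalid sequence.
 */
int
first_wide_char(const char *s, wchar_t *wc)
{
	/* Reset any shift state left over from an earlier call. */
	(void)mbtowc(NULL, NULL, 0);

	/* +1 so the terminating '\0' is among the bytes examined. */
	return mbtowc(wc, s, strlen(s) + 1);
}
```

Passing `strlen(s)` alone would give a length of 0 for `""`, which is the case the entry reports as failing with EILSEQ.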
The rest of this you can skip if you like, not directly related to this change...
Note: it is not clear to me what is correct here, POSIX looks to be ambiguous, or strange anyway; in the RETURN VALUE section it says:
If s is not a null pointer, mbtowc() shall either return 0 (if s points to the null byte), or return the number of bytes [...]
Further for the error possibilities it says:
[EILSEQ] An invalid character sequence is detected. In the POSIX locale an [EILSEQ] error cannot occur since all byte values are valid characters.
On the other hand our mbtowc(3) says:
There are special cases:
n == 0 In this case, the first n bytes of the array pointed to by s never form a complete character. Thus, the mbtowc() always fails.
Since EILSEQ is the only defined error for mbtowc() in POSIX, and cannot happen (according to it) in the POSIX locale, that "always fails" in our manual page looks dubious.
What actually happens in our mbtowc() in the POSIX locale, is that if passed n==0 (and *s == '\0') mbtowc() returns 0 (that's good) but also sets errno to EILSEQ (not so good - though this is not one of the functions guaranteed to not alter errno if it doesn't fail).
In other locales it returns -1 (with errno == EILSEQ) when n == 0. (Well, in some other locales anyway, I didn't go and test all of them).
Where POSIX gets weird, is that earlier it says:
At most n bytes of the array pointed to by s shall be examined.
If n == 0, then no bytes can be examined. In that case mbtowc() cannot test whether s points to the null byte, even in the POSIX locale.
So it is unclear (to me) what should be returned in that case.
|
1.57 |
| 06-Aug-2024 |
kre | Add %C format conversion and -L option to printf(1)
%C does what everyone always thought %c should do, but doesn't, and operates rather like the %c conversion in printf(3) (to be more precise, like %lc). It takes a code point integer value in the current locale's LC_CTYPE and prints the character designated.
-L (this printf's first, and only, option) makes the floating conversions use long double instead of double.
In the manual (printf.1) document both of those, and also be more precise as to when things are affecting bytes, and when they're manipulating characters (which makes no difference if LC_ALL=C).
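A rough sketch of what a %C style conversion has to do: turn an integer into the multibyte encoding of the corresponding character in the current LC_CTYPE. The helper name `encode_code_point` is hypothetical, and treating the integer as a `wchar_t` value is an assumption of this sketch rather than a statement about how printf.c does it.

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: encode cp in the current LC_CTYPE encoding.
 * buf must hold at least MB_LEN_MAX bytes. Returns the number of bytes
 * stored, or -1 if cp has no representation in this locale.
 */
int
encode_code_point(long cp, char *buf)
{
	(void)wctomb(NULL, 0);		/* reset shift state */
	return wctomb(buf, (wchar_t)cp);
}
```

The caller would then write the returned number of bytes from `buf` to standard output.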
|
1.56 |
| 06-Aug-2024 |
kre | PR bin/58534 -- printf(1) source code comment fix
Update the comment near the start of main() in printf.c so it explains what is really happening and why, rather than being a whole bunch of incorrect BS about what posix does or doesn't require.
This changes comments only, NFC (should be no binary change at all).
|
1.55 |
| 18-Jul-2024 |
wiz | Fix typo in comment.
Reported by Emanuele Torre in PR 58439.
|
1.54 |
| 20-May-2021 |
christos | fix typo
|
1.53 |
| 19-May-2021 |
kre | Changes for POSIX conformance.
1. exit(1) with an error message on stderr if an I/O error occurs.
1a. To work properly when built into /bin/sh sprinkle clearerr() at appropriate places.
2. Verify that when a 'X data value is used with one of the numeric conversions, nothing follows the 'X'. It used to be unclear in the standard whether this was required or not. It is clear that with numeric conversions the entire data value must be used, or an error must result. But with string conversions, that isn't the case and unused parts are simply ignored. This one is a numeric conversion with a string value, so which applies? The standard used to contain an example of '+3 being converted, producing the same as '+ ignoring the '3' with no mention of any error, so that's the approach we adopted. The forthcoming version now explicitly states that an error would also be generated from that case, as the '3' was not used by the numeric conversion.
2a. We support those conversions with floating as well as integer conversions, as the standard used to suggest that was required (but it makes no sense, the values are always integers, printing them in a floating format is dumb). The standard has been revised to make it clear that only the integer numeric conversions %d %u %x (etc) are supposed to handle the 'X form of data value. We still allow it with the floating formats as an extension, for backward compat, just in case someone (other than the ATF tests) is using it. It might go away.
2b. These formats are supposed to convert 'X where 'X' is a character (perhaps multibyte encoded) in the current LC_CTYPE locale category. We don't handle that, only 1 byte characters are handled currently. However the framework is now there to allow code to (one hopes, easily) be added to handle multibyte locales. (Note that for the purposes of #2 above, 'X' must be a single character, not a single byte.)
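The check in item 2 can be sketched as below for the single-byte case that item 2b says is currently handled. `quoted_char_value` is a hypothetical name, and the treatment of a lone quote (value 0) is an assumption of this sketch, not something the entry states.

```c
/*
 * Hypothetical helper: if arg starts with ' or ", store the value of
 * the following byte in *val. Returns 1 when the quoted form was used
 * completely, 0 when arg is not in quoted form, and -1 when extra
 * bytes follow the character (the error case from item 2).
 * Assumption: a lone quote yields the value 0.
 */
int
quoted_char_value(const char *arg, int *val)
{
	if (arg[0] != '\'' && arg[0] != '"')
		return 0;
	*val = (unsigned char)arg[1];
	if (arg[1] != '\0' && arg[2] != '\0')
		return -1;	/* e.g. '+3: the 3 was not used */
	return 1;
}
```

A multibyte-aware version would replace the single-byte read with a call such as `mbtowc()` and compare the consumed length against the remaining string.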
|
1.52 |
| 16-Apr-2021 |
christos | branches: 1.52.2; make value an int to avoid all the casts and conversion warnings.
|
1.51 |
| 16-Apr-2021 |
christos | Change octal and hex parsing to not use strtoul so that they don't handle '-'. From Martijn van Duren. Also add a warning if the conversion fails (like the gnu printf does)
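For context: `strtoul()` accepts an optional sign, so `"-1"` converts without error (to ULONG_MAX). A hand-rolled digit loop avoids that. The following is a sketch only; `parse_digits` is a hypothetical name and not the function from Martijn van Duren's patch.

```c
#include <ctype.h>

/*
 * Hypothetical helper: convert a string of digits in base 8 or 16.
 * Unlike strtoul(), it accepts no sign and no leading whitespace.
 * Returns 0 on success, -1 on an invalid character or empty input.
 * Overflow is not checked in this sketch.
 */
int
parse_digits(const char *s, int base, unsigned long *out)
{
	unsigned long v = 0;
	int d;

	if (*s == '\0')
		return -1;
	for (; *s != '\0'; s++) {
		unsigned char c = (unsigned char)*s;

		if (isdigit(c))
			d = c - '0';
		else if (isxdigit(c))
			d = tolower(c) - 'a' + 10;
		else
			return -1;	/* includes '-', '+' and spaces */
		if (d >= base)
			return -1;	/* e.g. '8' in an octal string */
		v = v * (unsigned long)base + (unsigned long)d;
	}
	*out = v;
	return 0;
}
```

A caller can print a warning when the function returns -1, which corresponds to the "warning if the conversion fails" part of the entry.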
|
1.50 |
| 22-Jul-2019 |
kre | Amend the previous change: we can have (almost) the best of both worlds, as when the first arg (which should be the format) contains no % conversions, and there are more args, the results are unspecified (according to POSIX).
We can use this so the previous usage printf -- format arg... (which is stupid, and pointless, but used to work) continues to simply ignore the -- (unspecified results mean we can do whatever feels good...)
This brings back the #if 0'd block from the previous modification (so there is no longer anything that needs cleaning up later) but runs the getopt() loop it contained only when there are at least 2 args (so any 1 arg printf always uses that arg as the format string, whatever it contains, including just "--") and also only when the first (format) arg contains no '%' characters (which guarantees no % conversions without needing to actually parse the arg). This is the (or a) "unspecified results" case from POSIX, so we are free to do anything we like - including assuming that we might have options (we don't) and pretending to process them.
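The two conditions described above (at least 2 args, and no '%' in the first one) could be expressed roughly as follows. `may_scan_options` is a hypothetical name; `nargs` and `args` are assumed to exclude the command name itself.

```c
#include <string.h>

/*
 * Hypothetical helper: decide whether the getopt() loop may run.
 * nargs/args describe the operands after the command name.
 * Returns 1 only in the "unspecified results" case: two or more
 * args and a first arg containing no '%' character at all.
 */
int
may_scan_options(int nargs, char *const args[])
{
	if (nargs < 2)
		return 0;	/* a single arg is always the format */
	return strchr(args[0], '%') == NULL;
}
```

So `printf --` prints `--`, while `printf -- format arg` is free to skip the `--`.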
|
1.49 |
| 21-Jul-2019 |
kre | Stop assuming that printf handles options in any way at all (it doesn't - that is, shouldn't) which includes processing -- as an "end of options". The first arg is (always) the format string.
Remove call to getopt() (but still do associated changes to argc/argv)
Note: for now this is #if 0'd out instead of being deleted; the old code should be fully removed sometime soon.
Problem pointed out on tech-userlevel by Thierry Laronde.
|
1.48 |
| 27-Jan-2019 |
kre | Revert previous, it was based upon a misreading of the POSIX spec. POSIX requires "as if by calling strtod()" which we did already ... by calling strtod(). Go back to doing that.
|
1.47 |
| 26-Jan-2019 |
kre | Always convert input numbers (from the command line) in the C locale, not as set in the environment. Conforms with POSIX spec.
|
1.46 |
| 10-Sep-2018 |
kre | A truly ancient bug found by Edgar Fuss
When printf is running builtin in a sh, global vars aren't reset to 0 between invocations. This affects "rval" which remembers state from a previous %b \c and thereafter always exits after the first format conversion, until we get a conversion that generates an error (which resets the flag almost by accident)
    printf %b abc\\c
    abc (no \n)
    printf %s%s hello world
    hello (no \n, of course, no world ...)
    printf %s%s hello world
    hello
    printf %s%s hello world
    hello
    printf %d hello
    printf: hello: expected numeric value
    0 (no \n)
    printf %s%s hello world
    helloworld (no \n, and we are back!)
This affects both /bin/sh and /bin/csh (and has for a very long time).
XXX pullup -8
|
1.45 |
| 04-Sep-2018 |
kre | Printf's that support \e for escape all seem to also support \E. Except us. Now we do as well.
|
1.44 |
| 03-Sep-2018 |
kre | Tighten syntax a little (no more %*4.*2d nonsense). Include the format collected so far in "missing format char" err message. Minor KNF and whitespace.
|
1.43 |
| 31-Aug-2018 |
kre | PR standards/53563
POSIX requires that signed numbers (strings preceded by '+' or '-') be allowed as inputs to all of the integer format conversions, including those which treat the data as unsigned.
Hence we do not need a variant function whose only difference from its companion is to reject strings starting with '-' - instead we use the primary function (getintmax()) for everything and remove getuintmax().
Minor update to the man page to indicate that the arg to all of the integer conversions (diouxX) must be an integer constant (with an optional sign) and to make it blatantly clear that %o is octal and %u is unsigned decimal (for some reason those weren't explicitly stated unlike d i x and X). Delete "respectively", it is not needed (and does not really apply).
XXX pullup -8
|
1.42 |
| 25-Jul-2018 |
kre | NFC: More KNF (remove () around returned constants).
|
1.41 |
| 25-Jul-2018 |
kre | NFC: whitespace & KNF.
|
1.40 |
| 24-Jul-2018 |
kre | Add support for F a and A formats (which go with the eEfgG formats already supported.)
|
1.39 |
| 03-Jul-2018 |
kre | Avoid printing error messages twice when an invalid escape sequence (\ sequence) is present in an arg to a %b conversion.
|
1.38 |
| 03-Jul-2018 |
kre | From leot@ on tech-userlevel:
Avoid running off into oblivion when a format string, or arg to a %b conversion ends in an unescaped backslash.
Patch from Leo slightly modified by me.
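The class of bug fixed here is an escape parser that steps over the byte after a backslash without checking whether that byte is the terminating NUL. A minimal sketch of a loop that guards against it follows; `unescape` is a hypothetical helper that handles only three escapes, and emitting the lone backslash literally is an assumption of the sketch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy in to out, translating \n, \t and \\.
 * out must have room for strlen(in) + 1 bytes. Returns the length
 * of the result.
 */
size_t
unescape(const char *in, char *out)
{
	size_t n = 0;

	while (*in != '\0') {
		if (*in != '\\') {
			out[n++] = *in++;
			continue;
		}
		in++;			/* skip the backslash */
		switch (*in) {
		case '\0':		/* trailing backslash */
			out[n++] = '\\';
			continue;	/* loop test now sees the NUL */
		case 'n':  out[n++] = '\n'; break;
		case 't':  out[n++] = '\t'; break;
		case '\\': out[n++] = '\\'; break;
		default:		/* unknown escape: keep both bytes */
			out[n++] = '\\';
			out[n++] = *in;
			break;
		}
		in++;
	}
	out[n] = '\0';
	return n;
}
```

The important line is the `'\0'` case: without it, the final `in++` would move past the end of the string.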
|
1.37 |
| 16-Jun-2015 |
christos | branches: 1.37.8; 1.37.14; 1.37.16; fix some error handling.
|
1.36 |
| 16-Jul-2013 |
christos | branches: 1.36.6; 1.36.8; 1.36.12; WARNS=6
|
1.35 |
| 15-Mar-2011 |
christos | branches: 1.35.4; 1.35.10; support grouping format.
|
1.34 |
| 13-Oct-2009 |
christos | Avoid segv on "printf '%*********s' 666", from Maksymilian Arciemowicz
|
1.33 |
| 21-Jul-2008 |
lukem | branches: 1.33.4; 1.33.8; 1.33.10; Remove the \n and tabs from the __COPYRIGHT() strings. Tweak to use a consistent format.
|
1.32 |
| 28-Mar-2008 |
christos | branches: 1.32.4; detect more errors from printf/malloc.
|
1.31 |
| 22-Mar-2005 |
dsl | Remember to consume input bytes when processing '\0nnn' for %b formats
|
1.30 |
| 30-Oct-2004 |
christos | branches: 1.30.2;
- KNF, WARNS=3, pass lint.
- Simplify octal parsing code.
|
1.29 |
| 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
1.28 |
| 25-Jun-2003 |
dsl | Revert previous. 'None' means that the "Utility Syntax Guidelines" apply.
|
1.27 |
| 25-Jun-2003 |
dsl | Remove getopt() loop, IEEE 1003.1 doesn't say that printf(1) should conform to the "Utility Syntax Guidelines". Fixes PR 21970.
|
1.26 |
| 24-Feb-2003 |
dsl | Fix the output of NUL bytes within %b formats. (Approved by Christos)
|
1.25 |
| 24-Nov-2002 |
christos | Fixes from David Laight:
- ansification
- format of output of jobs command (etc)
- job identifiers %+, %- etc
- $? and $(...)
- correct quoting of output of set, export -p and readonly -p
- differentiation between normal and 'posix special' builtins
- correct behaviour (posix) for errors on builtins and special builtins
- builtin printf and kill
- set -o debug (if compiled with DEBUG)
- cd src obj (as ksh - too useful to do without)
- unset -e name, remove non-readonly variable from export list. (so I could unset -e PS1 before running the test shell...)
|
1.24 |
| 14-Jun-2002 |
tron | Complete declaration of progprintf() to fix build problem in csh(1).
|
1.23 |
| 14-Jun-2002 |
wiz | Remove #ifdef __STDC__. De-__P() and ANSIfy. Fix a prototype mismatch uncovered by this.
|
1.22 |
| 05-May-2001 |
kleink | Change to use {u,}intmax_t internally (was: (unsigned) long).
|
1.21 |
| 19-Dec-1998 |
christos | brace pollution, and char -> unsigned char
|
1.20 |
| 14-Oct-1998 |
wsanchez | include unistd
|
1.19 |
| 03-Feb-1998 |
perry | add <unistd.h> to fix compiler warning
|
1.18 |
| 19-Oct-1997 |
lukem | s/index/strchr
|
1.17 |
| 18-Oct-1997 |
mrg | "merge" lite-2. our printf is already kinda different...minor changes only.
|
1.16 |
| 04-Jul-1997 |
christos | Fix compiler warnings.
|
1.15 |
| 14-Jan-1997 |
cgd | lint and KNF changes. (mostly casting returns to void to quiet lint.)
|
1.14 |
| 09-Jan-1997 |
tls | RCS ID police
|
1.13 |
| 03-Feb-1994 |
jtc | Simplify conversion of "quoted" numeric arguments.
|
1.12 |
| 03-Feb-1994 |
jtc | The code to check whether a conversion (by strtol(), strtoul(), or strtod()) succeeded was identical in each case, so I moved it into its own function.
|
1.11 |
| 03-Feb-1994 |
jtc | Add and use getulong() to handle %u, %o, %x & %X formatting directives. It was using getlong(), which caused values larger than LONG_MAX to be truncated to LONG_MAX. As recommended by 1003.2, print warning messages when argument cannot be converted to value or is out of range.
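The truncation described here follows from how `strtol()` behaves: on overflow it returns LONG_MAX and sets errno to ERANGE, whereas `strtoul()` can represent the same input. A sketch with hypothetical helper names (`get_long`, `get_ulong`; the warning text is also invented for illustration):

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: signed parse, 0 on success, -1 after a warning. */
int
get_long(const char *s, long *out)
{
	char *end;

	errno = 0;
	*out = strtol(s, &end, 0);	/* clamps to LONG_MAX on overflow */
	if (end == s || *end != '\0' || errno == ERANGE) {
		fprintf(stderr, "%s: not a valid signed value\n", s);
		return -1;
	}
	return 0;
}

/* Hypothetical helper: unsigned parse for %u, %o, %x and %X. */
int
get_ulong(const char *s, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(s, &end, 0);
	if (end == s || *end != '\0' || errno == ERANGE) {
		fprintf(stderr, "%s: not a valid unsigned value\n", s);
		return -1;
	}
	return 0;
}
```

Feeding the decimal text of ULONG_MAX to `get_long()` yields LONG_MAX and a warning, while `get_ulong()` converts it exactly.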
|
1.10 |
| 31-Dec-1993 |
jtc | Handle format string errors correctly.
|
1.9 |
| 25-Nov-1993 |
jtc | Error in hextobin() macro messed up hex escape constants.
|
1.8 |
| 19-Nov-1993 |
jtc | Oops! get rid of the free(), mklong()'s buffer no longer malloc()'d.
|
1.7 |
| 19-Nov-1993 |
jtc | Return from main() if a \c escape is encountered in a %b string (was an exit()). Use macro constants for "skip1" and "skip2" instead of assigning them each loop iteration. Reformat the multi-case entries in the "big switch" so the lines don't wrap.
|
1.6 |
| 19-Nov-1993 |
jtc | Move all the code from do_printf() into do-while loop in main(). I need to be able to return from main() when a "\c" in a %b string is encountered.
|
1.5 |
| 19-Nov-1993 |
jtc | Merged in most of the changes from 4.4 necessary to make printf a sh and csh builtin --- still need to handle the one remaining exit() in the SysV escape string handling code.
|
1.4 |
| 05-Nov-1993 |
jtc | Changes required to make printf utility POSIX.2 compliant:
* Escape characters in the string needed to be processed as they were encountered, otherwise a "\000" octal constant would prematurely terminate the formatting string.
* Implemented the %b formatting directive (SysV echo(1) compatibility).
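The "\000" problem arises whenever expanded escapes are stored back into a NUL-terminated string: the produced zero byte looks like the end of the string. One way to avoid it, sketched here with a hypothetical helper `expand_octal`, is to track the output length explicitly (the entry's own approach of emitting bytes as they are encountered achieves the same effect).

```c
#include <stddef.h>

/*
 * Hypothetical helper: expand \ooo octal escapes (one to three digits)
 * from in into out and return the number of bytes produced. The length
 * is returned explicitly because the output may contain NUL bytes.
 * out must have room for strlen(in) bytes.
 */
size_t
expand_octal(const char *in, unsigned char *out)
{
	size_t n = 0;

	while (*in != '\0') {
		if (in[0] == '\\' && in[1] >= '0' && in[1] <= '7') {
			unsigned int v = 0;
			int digits = 0;

			in++;				/* skip the backslash */
			while (digits < 3 && *in >= '0' && *in <= '7') {
				v = v * 8 + (unsigned int)(*in++ - '0');
				digits++;
			}
			out[n++] = (unsigned char)v;	/* may be 0 */
		} else {
			out[n++] = (unsigned char)*in++;
		}
	}
	return n;
}
```

The result should be written with `fwrite(out, 1, n, stdout)` or byte by byte, not with a function that stops at the first NUL.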
|
1.3 |
| 01-Aug-1993 |
mycroft | Add RCS identifiers.
|
1.2 |
| 19-Apr-1993 |
mycroft | Cleanup for GCC 2.
|
1.1 |
| 21-Mar-1993 |
cgd | branches: 1.1.1; Initial revision
|
1.1.1.2 |
| 22-Mar-1995 |
mrg | 4.4BSD-Lite2
|
1.1.1.1 |
| 21-Mar-1993 |
cgd | initial import of 386bsd-0.1 sources
|
1.30.2.1 |
| 27-Mar-2005 |
tron | Pull up revision 1.31 (requested by dsl in ticket #57): Remember to consume input bytes when processing '\0nnn' for %b formats
|
1.32.4.1 |
| 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
1.33.10.1 |
| 21-Apr-2010 |
matt | sync to netbsd-5
|
1.33.8.1 |
| 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by christos in ticket #1091): usr.bin/printf/printf.c: revision 1.34
Avoid segv on "printf '%*********s' 666".
|
1.33.4.1 |
| 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by christos in ticket #1091): usr.bin/printf/printf.c: revision 1.34
Avoid segv on "printf '%*********s' 666".
|
1.35.10.1 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.35.4.1 |
| 22-May-2014 |
yamt | sync with head.
for reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.36.12.1 |
| 12-Jul-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #1619):
usr.bin/printf/printf.c: revision 1.37-1.39
fix some error handling.
From leot@ on tech-userlevel: Avoid running off into oblivion when a format string, or arg to a %b conversion ends in an unescaped backslash.
Patch from Leo slightly modified by me.
Avoid printing error messages twice when an invalid escape sequence (\ sequence) is present in an arg to a %b conversion.
|
1.36.8.1 |
| 12-Jul-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #1619):
usr.bin/printf/printf.c: revision 1.37-1.39
fix some error handling.
From leot@ on tech-userlevel: Avoid running off into oblivion when a format string, or arg to a %b conversion ends in an unescaped backslash.
Patch from Leo slightly modified by me.
Avoid printing error messages twice when an invalid escape sequence (\ sequence) is present in an arg to a %b conversion.
|
1.36.6.1 |
| 12-Jul-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #1619):
usr.bin/printf/printf.c: revision 1.37-1.39
fix some error handling.
From leot@ on tech-userlevel: Avoid running off into oblivion when a format string, or arg to a %b conversion ends in an unescaped backslash.
Patch from Leo slightly modified by me.
Avoid printing error messages twice when an invalid escape sequence (\ sequence) is present in an arg to a %b conversion.
|
1.37.16.2 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.37.16.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.37.14.4 |
| 26-Jan-2019 |
pgoyette | Sync with HEAD
|
1.37.14.3 |
| 30-Sep-2018 |
pgoyette | Sync with HEAD
|
1.37.14.2 |
| 06-Sep-2018 |
pgoyette | Sync with HEAD
Resolve a couple of conflicts (result of the uimin/uimax changes)
|
1.37.14.1 |
| 28-Jul-2018 |
pgoyette | Sync with HEAD
|
1.37.8.3 |
| 23-Sep-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #1020): usr.bin/printf/printf.c: revision 1.46
A truly ancient bug found by Edgar Fuss
When printf is running builtin in a sh, global vars aren't reset to 0 between invocations. This affects "rval" which remembers state from a previous %b \c and thereafter always exits after the first format conversion, until we get a conversion that generates an error (which resets the flag almost by accident)
    printf %b abc\\c
    abc (no \n)
    printf %s%s hello world
    hello (no \n, of course, no world ...)
    printf %s%s hello world
    hello
    printf %s%s hello world
    hello
    printf %d hello
    printf: hello: expected numeric value
    0 (no \n)
    printf %s%s hello world
    helloworld (no \n, and we are back!)
This affects both /bin/sh and /bin/csh (and has for a very long time).
XXX pullup -8
|
1.37.8.2 |
| 01-Sep-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #1002):
usr.bin/printf/printf.1: revision 1.31 (via patch) usr.bin/printf/printf.c: revision 1.43
PR standards/53563
POSIX requires that signed numbers (strings preceded by '+' or '-') be allowed as inputs to all of the integer format conversions, including those which treat the data as unsigned.
Hence we do not need a variant function whose only difference from its companion is to reject strings starting with '-' - instead we use the primary function (getintmax()) for everything and remove getuintmax().
Minor update to the man page to indicate that the arg to all of the integer conversions (diouxX) must be an integer constant (with an optional sign) and to make it blatantly clear that %o is octal and %u is unsigned decimal (for some reason those weren't explicitly stated unlike d i x and X). Delete "respectively", it is not needed (and does not really apply).
XXX pullup -8
|
1.37.8.1 |
| 13-Jul-2018 |
martin | Pull up following revision(s) (requested by kre in ticket #914):
usr.bin/printf/printf.c: revision 1.38,1.39
From leot@ on tech-userlevel: Avoid running off into oblivion when a format string, or arg to a %b conversion ends in an unescaped backslash.
Patch from Leo slightly modified by me.
Avoid printing error messages twice when an invalid escape sequence (\ sequence) is present in an arg to a %b conversion.
|
1.52.2.1 |
| 31-May-2021 |
cjep | sync with head
|