| History log of /src/usr.bin/sort |
| Revision | Date | Author | Comments |
| 1.12 | 03-Aug-2023 |
rin | Revert CC_WNO_USE_AFTER_FREE from Makefiles (thanks uwe@)
|
| 1.11 | 03-Aug-2023 |
rin | Sprinkle CC_WNO_USE_AFTER_FREE for GCC 12
All of them are blamed for idiom equivalent to: newbuf = realloc(buf, size); p = newbuf + (p - buf);
|
| 1.10 | 03-Jun-2023 |
lukem | bsd.own.mk: rename GCC_NO_* to CC_WNO_*
Rename compiler-warning-disable variables from GCC_NO_warning to CC_WNO_warning where warning is the full warning name as used by the compiler.
GCC_NO_IMPLICIT_FALLTHRU is CC_WNO_IMPLICIT_FALLTHROUGH
Using the convention CC_compilerflag, where compilerflag is based on the full compiler flag name.
|
| 1.9 | 13-Oct-2019 |
mrg | introduce some common variables for use in GCC warning disables:
GCC_NO_FORMAT_TRUNCATION -Wno-format-truncation (GCC 7/8) GCC_NO_STRINGOP_TRUNCATION -Wno-stringop-truncation (GCC 8) GCC_NO_STRINGOP_OVERFLOW -Wno-stringop-overflow (GCC 8) GCC_NO_CAST_FUNCTION_TYPE -Wno-cast-function-type (GCC 8)
use these to turn off warnings for most GCC-8 complaints. many of these are false positives, most of the real bugs are already committed, or are yet to come.
we plan to introduce versions of (some?) of these that use the "-Wno-error=" form, which still displays the warnings but does not make it an error, and all of the above will be re-considered as either being "fix me" (warning still displayed) or "warning is wrong."
|
| 1.8 | 10-Sep-2009 |
dsl | branches: 1.8.46; Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d); this simplifies the code and allows a 'posix' sort.
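The 'posix' ordering described above can be sketched roughly as follows. This is only an illustration, not the code from sort(1): `struct rec`, `cmp_bytes()` and `rec_cmp()` are hypothetical names, and the sketch assumes each record carries an explicit key length, as the commit message says.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a generated key plus the raw input line. */
struct rec {
	const unsigned char *key;
	size_t keylen;
	const unsigned char *data;
	size_t datalen;
};

/* Unweighted byte compare; a string that is a prefix of the other sorts first. */
static int
cmp_bytes(const unsigned char *a, size_t alen,
    const unsigned char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int r = memcmp(a, b, n);

	if (r != 0)
		return r;
	return (alen > blen) - (alen < blen);
}

/*
 * Compare by key first.  When posix_order is non-zero, records with
 * equal keys are ordered by their raw data; otherwise they compare
 * equal and a stable sort keeps them in input order.
 */
static int
rec_cmp(const struct rec *a, const struct rec *b, int posix_order)
{
	int r = cmp_bytes(a->key, a->keylen, b->key, b->keylen);

	if (r != 0 || !posix_order)
		return r;
	return cmp_bytes(a->data, a->datalen, b->data, b->datalen);
}
```

Because the key length is stored, no byte value has to be reserved as a terminator inside the key itself.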
|
| 1.7 | 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
| 1.6 | 14-Apr-2009 |
lukem | Enable WARNS=4 by default for usr.bin, except for: awk bdes checknr compile_et error gss hxtool kgetcred kinit klist ldd less lex locale login m4 man menuc mk_cmds mklocale msgc openssl rpcgen rpcinfo sdiff spell ssh string2key telnet tn3270 verify_krb5_conf xlint
|
| 1.5 | 20-Mar-2003 |
jdolecek | branches: 1.5.40; 1.5.42; 1.5.46; this builds with WARNS=2
|
| 1.4 | 19-Feb-2001 |
jdolecek | put tmp.c back to Makefile, too
|
| 1.3 | 08-Jan-2001 |
jdolecek | make ftmp() wrapper around tmpfile(), there is no need to reimplement it; move ftmp() from tmp.c to files.c; g/c no longer needed stuff
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.5.46.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.5.42.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.5.40.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.8.46.1 | 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
| 1.23 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.22 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d); this simplifies the code and allows a 'posix' sort.
|
| 1.21 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.20 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
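The idea of folding case once, at key-generation time, so that every later compare is a plain memcmp() might look like the sketch below. It is an assumption-laden illustration: `make_folded_key()` and `folded_key_cmp()` are hypothetical helpers, and folding to upper case via toupper() is a choice made here, not something the log specifies.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Build a key by folding every byte of the field to upper case.
 * The caller supplies a key buffer of at least len bytes.
 * Returns the key length.
 */
static size_t
make_folded_key(const char *field, size_t len, unsigned char *key)
{
	for (size_t i = 0; i < len; i++)
		key[i] = (unsigned char)toupper((unsigned char)field[i]);
	return len;
}

/* Once keys are folded, the compare needs no weight table. */
static int
folded_key_cmp(const unsigned char *a, size_t alen,
    const unsigned char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int r = memcmp(a, b, n);

	if (r != 0)
		return r;
	return (alen > blen) - (alen < blen);
}
```

The compare routine never needs to know that -f was given; only the key generator does.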
|
| 1.19 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.18 | 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. E.g. PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort for fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
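Merging several fully sorted chunks can be sketched as below. This is a simplified illustration, not the merge code in sort(1): it merges in-memory arrays of int rather than temporary files of records, and `struct run` and `merge_runs()` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical sorted run: n values, pos is the next unread index. */
struct run {
	const int *v;
	size_t n;
	size_t pos;
};

/*
 * Merge nruns sorted runs into out, which must have room for the
 * total number of values.  Returns the number of values written.
 * Picking the first run among equal heads keeps the merge stable.
 */
static size_t
merge_runs(struct run *runs, size_t nruns, int *out)
{
	size_t nout = 0;

	for (;;) {
		struct run *best = NULL;

		for (size_t i = 0; i < nruns; i++) {
			struct run *r = &runs[i];

			if (r->pos == r->n)
				continue;	/* this run is exhausted */
			if (best == NULL || r->v[r->pos] < best->v[best->pos])
				best = r;
		}
		if (best == NULL)
			return nout;	/* all runs exhausted */
		out[nout++] = best->v[best->pos++];
	}
}
```

Unlike partitioning by leading key bytes, the cost of this approach does not depend on how the key values are distributed, so an input where nearly all records are identical is handled like any other.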
|
| 1.17 | 16-Aug-2009 |
dsl | 'depth' is used for the number of bytes into the key that the pointers reference, when we want to find the record header put the larger value into 'hdr_off' to avoid any confusion that the code might be changing 'depth'! There is now no need to save the original value as 'odepth' in append.c. All in a vague attempt to make this code slightly readable.
|
| 1.16 | 16-Aug-2009 |
dsl | Replace all uses of sizeof(TRECHEADER) with REC_DATA_OFFSET - which is defined as offsetof(RECHEADER, data). Delete TRECHEADER.
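A minimal sketch of why offsetof() is the right measure here. The struct layout below is a stand-in assumed for illustration (the log only tells us that a RECHEADER has a `data` member and that `length` and `offset` are size_t), and `rec_from_data()` is a hypothetical helper.

```c
#include <stddef.h>

/* Stand-in record header; the real layout is not shown in this log. */
typedef struct {
	size_t length;
	size_t offset;
	unsigned char data[1];	/* record bytes start here */
} RECHEADER;

/* Distance from the start of the header to the first data byte. */
#define REC_DATA_OFFSET offsetof(RECHEADER, data)

/* Step back from a data pointer to the header that contains it. */
static RECHEADER *
rec_from_data(unsigned char *data)
{
	return (RECHEADER *)(void *)(data - REC_DATA_OFFSET);
}
```

With this layout sizeof(RECHEADER) also counts the data byte and any trailing padding, so it overstates the header size; offsetof() gives the exact distance without a second header-only type.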
|
| 1.15 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.14 | 28-Apr-2008 |
martin | branches: 1.14.6; 1.14.12; Remove clause 3 and 4 from TNF licenses
|
| 1.13 | 15-Feb-2004 |
jdolecek | branches: 1.13.32; fix some cases of use of uninitialized variables
|
| 1.12 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.11 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.10 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.9 | 18-Jan-2001 |
jdolecek | cosmetic style change
|
| 1.8 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.7 | 08-Jan-2001 |
jdolecek | by default, use stable sort; add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
| 1.6 | 17-Oct-2000 |
jdolecek | fix bugs caused by implicit assumption that 'length' and 'offset' members of struct recheader/trecheader are shorts - they are size_t now this makes sort pass all tests in TEST/stests again after my last change
other misc cosmetic changes
|
| 1.5 | 16-Oct-2000 |
jdolecek | constify
|
| 1.4 | 15-Oct-2000 |
jdolecek | don't use register declarations
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.13.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.14.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.14.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.7 | 24-Dec-2002 |
jdolecek | put contents of extern.h directly to sort.h, and g/c extern.h de-__P()
|
| 1.6 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.5 | 12-Jan-2001 |
jdolecek | cosmetic prototype adjustment
|
| 1.4 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.3 | 16-Oct-2000 |
jdolecek | constify, prototype for seq() moved to files.c
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.33 | 20-Jan-2013 |
apb | When parsing numbers, allow a leading '+'.
|
| 1.32 | 18-Dec-2010 |
christos | branches: 1.32.6; 1.32.12; Add an 'l' style for sorting that sorts by the string length of the field.
|
| 1.31 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.30 | 07-Oct-2009 |
dsl | When encoding numbers, we can use all 8 bits for exponent values.
|
| 1.29 | 16-Sep-2009 |
dsl | Minor tweaks to the key generation for numeric fields. Use 1's complement for -ve numbers to avoid conditionals.
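The one's-complement trick can be illustrated with a much simpler encoding than the one sort(1) uses. The sketch below is not the fields.c format (which handles decimal strings and exponents); it assumes a 32-bit integer input, and `encode_int()` is a hypothetical function. It only shows how inverting every byte of a negative number's key reverses its order without branching.

```c
#include <stdint.h>

/*
 * Encode v as a 5-byte key whose memcmp() order matches numeric order.
 * Byte 0 is a sign marker, bytes 1-4 are the magnitude, big-endian.
 * For negative values every byte is inverted (one's complement), so a
 * larger magnitude yields a smaller key.  No branches are needed.
 */
static void
encode_int(int32_t v, unsigned char key[5])
{
	uint32_t m32 = 0u - (uint32_t)(v < 0);	  /* all ones if negative */
	uint32_t mag = ((uint32_t)v ^ m32) - m32; /* absolute value */
	unsigned char mask = (unsigned char)m32;  /* 0x00 or 0xff */

	key[0] = (unsigned char)(0x80 ^ mask);
	for (int i = 0; i < 4; i++)
		key[1 + i] = (unsigned char)((mag >> (24 - 8 * i)) ^ mask);
}
```

Keys produced this way can be compared with memcmp() over all 5 bytes.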
|
| 1.28 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d); this simplifies the code and allows a 'posix' sort.
|
| 1.27 | 22-Aug-2009 |
dsl | Fix generation of unmasked alpha keys.
|
| 1.26 | 22-Aug-2009 |
dsl | Only process each number digit once.
|
| 1.25 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
|
| 1.24 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.23 | 15-Aug-2009 |
dsl | Always add an REC_D char (usually \n) as the last sort key char - we almost always need one. But do ADD it, instead of overwriting the last byte of the last key since that may be requesting the other end of the sort order. There is no need to check for space for the line after adding the key, but we might as well check before - just to optimise that case. This might fix some of the sort bugs - but not the one I'm looking at!
|
| 1.22 | 15-Aug-2009 |
dsl | Remove reference to db.h by using separate ptr+len fields for the only structure that used it. Pass end of keybuf area, not size to enterkey() - largely to remove a variable whose use isn't obvious from the name! The structure of this code sucks.
|
| 1.21 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.20 | 13-Apr-2009 |
lukem | Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
|
| 1.19 | 28-Apr-2008 |
martin | branches: 1.19.6; 1.19.8; 1.19.12; Remove clause 3 and 4 from TNF licenses
|
| 1.18 | 14-Mar-2004 |
heas | branches: 1.18.32; Do not step over the edge of the buffer (check for '\0'). This just happens to not lose on i386 because another buffer appears immediately following. Regress tests all passed.
|
| 1.17 | 15-Feb-2004 |
jdolecek | make sure zero is recognized as regular number in number(), and thus sorted properly with -n; fixes PR bin/20259 by Giles Lean, PR bin/20542 by Peter Seebach, and part of PR bin/24316 by MLH
|
| 1.16 | 15-Feb-2004 |
jdolecek | fix -Wuninitialized warnings
|
| 1.15 | 18-Oct-2003 |
itojun | KNF (mostly whitespace)
|
| 1.14 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.13 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.12 | 09-Apr-2003 |
jdolecek | rename local macro blancmange() to SKIP_BLANKS(), to clarify what it does and to better signal it might modify its arguments; fixes PR bin/20546 by Peter Seebach
|
| 1.11 | 24-Dec-2002 |
jdolecek | add extern definition for ncols and clist[] to sort.h, eliminate extra definitions in init.c and field.c g/c MAXMERGE
|
| 1.10 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.9 | 19-Feb-2001 |
jdolecek | enterfield(): test the buffer size BEFORE assignment also for the other code branch, since we might get called with tablepos == endkey for some special input files (where a record would happen to fit exactly to the input buffer) - BTW, this bug looks like it has been here ~forever ...
This seems to fix the sort crash for 'make british' build of ispell package, as reported by Mark White at current-users@.
|
| 1.8 | 19-Feb-2001 |
jdolecek | enterkey(): * move the test for keybuf size before keypos[-1] assignment, "just in case" * move the keypos assignment to improve readability
|
| 1.7 | 13-Jan-2001 |
jdolecek | also remove the clpos++ added in rev 1.4
|
| 1.6 | 13-Jan-2001 |
jdolecek | undo broken revision 1.4
|
| 1.5 | 12-Jan-2001 |
jdolecek | for stable sort, arrange so that really only relevant part of line is used for sort - this makes sort pass regression test number 36
while here, slightly adjust code formatting in a couple of places
|
| 1.4 | 17-Oct-2000 |
jdolecek | cosmetic change in way one of for variables is updated
|
| 1.3 | 15-Oct-2000 |
jdolecek | don't use register declarations
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.18.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.19.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.19.8.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.19.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.32.12.1 | 25-Feb-2013 |
tls | resync with head
|
| 1.32.6.1 | 23-Jan-2013 |
yamt | sync with head
|
| 1.43 | 10-Aug-2023 |
mrg | avoid various use-after-free issues.
create a ptrdiff_t offset between the start of an allocation region and some interesting pointer, so it can be adjusted with this offset after realloc() returns. for pdisk(), realloc() is a locally inlined malloc() and free() pair.
for mail(1), this required a little bit more effort as the old pointer was passed into another file for fix-ups there, and that code needed to be adjusted for offset vs old pointer usage.
found by GCC 12.
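The offset approach described in this entry can be sketched as follows. This is an illustration of the general pattern rather than the actual change to sort(1); `grow_keep_pos()` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Grow *bufp to newsize bytes while keeping *pp at the same offset
 * within the buffer.  The offset is taken before realloc() is called,
 * so the old pointer value is never used after it may have been freed.
 * Returns 0 on success, or -1 on failure with *bufp and *pp unchanged.
 */
static int
grow_keep_pos(char **bufp, char **pp, size_t newsize)
{
	ptrdiff_t off = *pp - *bufp;	/* old buffer is still valid here */
	char *newbuf = realloc(*bufp, newsize);

	if (newbuf == NULL)
		return -1;
	*bufp = newbuf;
	*pp = newbuf + off;
	return 0;
}
```

The older idiom `p = newbuf + (p - buf)` evaluates `p - buf` after realloc() has returned, when `buf` may already have been freed; that use of the stale pointer is what GCC 12 reports.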
|
| 1.42 | 05-Aug-2015 |
mrg | add a description of what was being attempted to the failed-write messages.
|
| 1.41 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.40 | 07-Oct-2009 |
dsl | long align records written to temporary files.
|
| 1.39 | 28-Sep-2009 |
dsl | Fix borked fix for sort relying on realloc() changing the buffer end. Sorts of more than 8MB data now probably work again.
|
| 1.38 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
| 1.37 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d); this simplifies the code and allows a 'posix' sort.
|
| 1.36 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.35 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
|
| 1.34 | 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. E.g. PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort for fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
|
| 1.33 | 16-Aug-2009 |
dsl | Replace all uses of sizeof(TRECHEADER) with REC_DATA_OFFSET - which is defined as offsetof(RECHEADER, data). Delete TRECHEADER.
|
| 1.32 | 15-Aug-2009 |
dsl | Remove reference to db.h by using separate ptr+len fields for the only structure that used it. Pass end of keybuf area, not size to enterkey() - largely to remove a variable whose use isn't obvious from the name! The structure of this code sucks.
|
| 1.31 | 15-Aug-2009 |
dsl | linebuf and linebuf_size are only used inside seq() - which also not only has its own static variable, but will also extend the buffer. Remove linebuf/size and change seq() to use a private, locally managed buffer.
|
| 1.30 | 15-Aug-2009 |
dsl | Remove the unused 'DBT *key' parameter from seq().
|
| 1.29 | 15-Aug-2009 |
dsl | In makeline() change 'pos' from 'char *' to 'u_char *' and remove all the casts associated with its use. None of the uses can possibly care about the signedness of the pointer.
|
| 1.28 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.27 | 13-Apr-2009 |
lukem | Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
|
| 1.26 | 28-Apr-2008 |
martin | branches: 1.26.6; 1.26.8; 1.26.12; Remove clause 3 and 4 from TNF licenses
|
| 1.25 | 11-May-2006 |
mrg | branches: 1.25.20; char -> u_char in a couple of places to match other variables.
|
| 1.24 | 07-Jun-2005 |
he | Initialize a local variable to appease -Wuninitialized. Marked with XXXGCC for pmppc (found while compiling for it).
Reviewed by lukem.
|
| 1.23 | 15-Feb-2004 |
jdolecek | fix some cases of use of uninitialized variables
|
| 1.22 | 18-Oct-2003 |
itojun | KNF (mostly whitespace)
|
| 1.21 | 16-Oct-2003 |
itojun | safer use of realloc
|
| 1.20 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.19 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.18 | 24-Dec-2002 |
jdolecek | max_o in struct tempfile needs to be off_t; use fseeko() rather than fseek() when changing file offset using max_o
|
| 1.17 | 15-May-2001 |
jdolecek | Make compilable with -Wshadow
|
| 1.16 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.15 | 19-Feb-2001 |
jdolecek | resurrect old ftmp() - it supports alternative directory for temporary file, which is needed for -T support
|
| 1.14 | 18-Jan-2001 |
jdolecek | makeline(): make the overflow handling code safe vs. buffer realloc, add a comment explaining what we do here
|
| 1.13 | 13-Jan-2001 |
jdolecek | makeline(): put back the memmove(3) removed in rev 1.5 in belief it's been redundant. "Oops" This fixes bug reported to me by Simon Burge.
|
| 1.12 | 13-Jan-2001 |
itojun | fix few confusing indentation. XXX still broken
|
| 1.11 | 13-Jan-2001 |
jdolecek | one more warning to kill
|
| 1.10 | 13-Jan-2001 |
jdolecek | Since SUS explicitly specifies sort(1) should append a record delimiter to file if it doesn't end with one, don't warn when this happens.
|
| 1.9 | 12-Jan-2001 |
jdolecek | remove #if 0 part
|
| 1.8 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.7 | 08-Jan-2001 |
jdolecek | make ftmp() wrapper around tmpfile(), there is no need to reimplement it; move ftmp() from tmp.c to files.c; g/c no longer needed stuff
|
| 1.6 | 17-Oct-2000 |
jdolecek | fix bugs caused by implicit assumption that 'length' and 'offset' members of struct recheader/trecheader are shorts - they are size_t now this makes sort pass all tests in TEST/stests again after my last change
other misc cosmetic changes
|
| 1.5 | 16-Oct-2000 |
jdolecek | enlarge line buffer as necessary, so that it's possible to process lines longer than 65522 characters; constify, rename MAXLLEN to DEFLLEN
|
| 1.4 | 15-Oct-2000 |
jdolecek | don't use register declarations
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.25.20.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.26.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.26.8.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.26.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.47 | 05-Feb-2010 |
enami | Don't touch past the end of the allocated region. It results in a segmentation violation.
|
| 1.46 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.45 | 09-Oct-2009 |
dsl | If anyone is stupid enough to feed records longer than 8MB into sort, don't sit in an infinite loop, instead eat memory until we have read 8 records.
|
| 1.44 | 09-Oct-2009 |
dsl | Don't give merge an empty file when we detect EOF with nothing in our buffer.
|
| 1.43 | 28-Sep-2009 |
dsl | Fix borked fix for sort relying on realloc() changing the buffer end. Sorts of more than 8MB data now probably work again.
|
| 1.42 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
| 1.41 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d); this simplifies the code and allows a 'posix' sort.
|
| 1.40 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.39 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys: - If we generate a key, it is always sortable using memcmp() - If we are sorting the whole record, then a weight-table must be used during compares. - Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign. - Case folding (-f) is handled when the sort keys are generated. No other code has to care at all. - Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple! Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308 Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
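The idea of a key that is "always sortable using memcmp()" can be illustrated with a much simpler case than the one this commit handles. The sketch below is an assumption for illustration only: it encodes a fixed-width 32-bit integer, whereas sort(1) has to encode arbitrary decimal numbers, and the names `encode_i32_key()` and `compare_i32_keys()` are hypothetical. Flipping the sign bit and storing the bytes most-significant first makes byte-wise order agree with numeric order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode a signed 32-bit value as 4 bytes whose memcmp() order matches
 * numeric order: flip the sign bit so negatives sort below positives,
 * then store big-endian so the most significant byte is compared first.
 */
static void
encode_i32_key(int32_t v, unsigned char key[4])
{
	uint32_t u = (uint32_t)v ^ UINT32_C(0x80000000);

	assert(key != NULL);
	key[0] = (unsigned char)(u >> 24);
	key[1] = (unsigned char)(u >> 16);
	key[2] = (unsigned char)(u >> 8);
	key[3] = (unsigned char)u;
}

/* Compare two values purely through their encoded keys. */
static int
compare_i32_keys(int32_t a, int32_t b)
{
	unsigned char ka[4], kb[4];

	encode_i32_key(a, ka);
	encode_i32_key(b, kb);
	return memcmp(ka, kb, sizeof(ka));
}
```

Once every key has this property, the comparison code no longer needs to know what kind of field produced the key.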
|
| 1.38 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.37 | 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. Eg PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort for fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
|
| 1.36 | 16-Aug-2009 |
dsl | 'depth' is used for the number of bytes into the key that the pointers reference, when we want to find the record header put the larger value into 'hdr_off' to avoid any confusion that the code might be changing 'depth'! There is now no need to save the original value as 'odepth' in append.c. All in a vague attempt to make this code slightly readable.
|
| 1.35 | 16-Aug-2009 |
dsl | Replace all uses of sizeof(TRECHEADER) with REC_DATA_OFFSET - which is defined as offsetof(RECHEADER, data). Delete TRECHEADER.
|
| 1.34 | 15-Aug-2009 |
dsl | linebuf and linebuf_size are only used inside seq() - which also not only has its own static variable, but will also extend the buffer. Remove linebuf/size and change seq() to use a private, locally managed buffer.
|
| 1.33 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.32 | 28-Apr-2008 |
martin | branches: 1.32.6; 1.32.12; Remove clause 3 and 4 from TNF licenses
|
| 1.31 | 10-Jun-2005 |
jmc | branches: 1.31.20; Init some variables the compiler is complaining about and mark w. XXGCC as it affects only m68k compilers.
|
| 1.30 | 15-Feb-2004 |
jdolecek | fix -Wuninitialized warnings
|
| 1.29 | 18-Oct-2003 |
itojun | KNF (mostly whitespace)
|
| 1.28 | 17-Oct-2003 |
enami | Test the value returned by realloc() rather than anything else.
|
| 1.27 | 16-Oct-2003 |
itojun | safer use of realloc
|
| 1.26 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.25 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.24 | 24-Dec-2002 |
jdolecek | improve previous slightly - need >= (not just >) in CHECKFSTACK()
|
| 1.23 | 24-Dec-2002 |
jdolecek | make sure we don't attempt to write past end of fstack[], error out instead
this fixes second part ('tmpdir get smashed') of bin/18614 by Michael Graff
|
| 1.22 | 10-Oct-2002 |
jdolecek | g/c extern reference to toutpath
|
| 1.21 | 30-Sep-2002 |
enami | Use the right file to output merge result.
|
| 1.20 | 15-May-2001 |
jdolecek | branches: 1.20.2; Only try to copy the extra incomplete record data if there is anything actually read already. Albeit it's not damaging to copy zero data for bufend == crec->data case, the buffer end could also be between memory position 'crec' and 'crec->data'. Thus, we could end up with negative 'bufend - crec->data' value, and obvious havoc.
This change fixes lib/12673, though the problem was masked and no longer repeatable with the provided example after the recent buffer size bump. The change was tested with the buffer size change backed off, and really fixes the problem in the PR.
|
| 1.19 | 15-May-2001 |
jdolecek | fsort(): rearrange the push code to reduce one level of indentation, free keylist, buffer on end of work; no functional changes
|
| 1.18 | 14-May-2001 |
jdolecek | Bump the initial record buffer size to 1MB and allow it to grow to 8MB, if needed and record count is within bounds (<MAXNUM), rather than sorting the input by 64KB chunks. This cuts the number of needed temporary files considerably (and improves performance, too). Slightly adjust some #defines, mostly to power of 2 values.
This addresses bin/12673 and bin/12614, as well as complaints from other people.
|
| 1.17 | 20-Feb-2001 |
jdolecek | fsort(): don't call append() with zero nelem This fixes the 'sort -f /dev/null' coredump reported on current-users.
|
| 1.16 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.15 | 19-Feb-2001 |
jdolecek | oops - wrong file, backoff local test change
|
| 1.14 | 19-Feb-2001 |
jdolecek | enterkey(): * move the test for keybuf size before keypos[-1] assignment, "just in case" * move the keypos assignment to improve readability
|
| 1.13 | 19-Feb-2001 |
jdolecek | cosmetic changes - make keylist[] static and remove extern definition in fsort.h, move macro SALIGN() from sort.h to fsort.c
|
| 1.12 | 05-Feb-2001 |
itojun | make sure to initialize malloc'ed region. PR 12138. found by malloc.conf=AJ
|
| 1.11 | 19-Jan-2001 |
jdolecek | use MERGE_FNUM instead of magic value 16
|
| 1.10 | 18-Jan-2001 |
jdolecek | keep bumping the record buffer up to 8 records - this is to avoid making excessive number of temporary files for oversized records; the way the buffer is enlarged is now also safer
initialize 'bufsize' statically, so that the value can be safely used in e.g. msort.c:fmerge()
|
| 1.9 | 13-Jan-2001 |
itojun | fix few confusing indentation. XXX still broken
|
| 1.8 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.7 | 08-Jan-2001 |
jdolecek | by default, use stable sort add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
| 1.6 | 17-Oct-2000 |
jdolecek | fix bugs caused by implicit assumption that 'length' and 'offset' members of struct recheader/trecheader are shorts - they are size_t now this makes sort pass all tests in TEST/stests again after my last change
other misc cosmetic changes
|
| 1.5 | 16-Oct-2000 |
jdolecek | enlarge line buffer as necessary, so that it's possible to process lines longer than 65522 characters constify, rename MAXLLEN to DEFLLEN
|
| 1.4 | 15-Oct-2000 |
jdolecek | don't use register declarations
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.20.2.1 | 01-Oct-2002 |
lukem | Pull up revision 1.21 (requested by enami in ticket #883): Use the right file to output merge result.
|
| 1.31.20.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.32.12.2 | 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE
|
| 1.32.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.32.6.2 | 29-Jun-2010 |
riz | Pull up following revision(s) (requested by dholland in ticket #1420): usr.bin/sort/sort.h: revision 1.31 usr.bin/sort/sort.c: revision 1.58 usr.bin/sort/fsort.c: revision 1.47 usr.bin/sort/msort.c: revision 1.30 Don't touch past the end of allocated region. It results segmentation violation.
|
| 1.32.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.18 | 25-Oct-2023 |
simonb | Correct a comment - 8 * 1 million is 8 million, not 10 million (!).
|
| 1.17 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
| 1.16 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.15 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.14 | 15-Aug-2009 |
dsl | linebuf and linebuf_size are only used inside seq() - which also not only has its own static variable, but will also extend the buffer. Remove linebuf/size and change seq() to use a private, locally managed buffer.
|
| 1.13 | 28-Apr-2008 |
martin | branches: 1.13.6; 1.13.12; Remove clause 3 and 4 from TNF licenses
|
| 1.12 | 07-Aug-2003 |
jdolecek | branches: 1.12.32; add TNF copyright
|
| 1.11 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.10 | 24-Dec-2002 |
jdolecek | max_o in struct tempfile needs to be off_t use fseeko() rather than fseek() when changing file offset using max_o
|
| 1.9 | 14-May-2001 |
jdolecek | Bump the initial record buffer size to 1MB and allow it to grow to 8MB, if needed and record count is within bounds (<MAXNUM), rather than sorting the input by 64KB chunks. This cuts the number of needed temporary files considerably (and improves performance, too). Slightly adjust some #defines, mostly to power of 2 values.
This addresses bin/12673 and bin/12614, as well as complaints from other people.
|
| 1.8 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.7 | 19-Feb-2001 |
jdolecek | cosmetic changes - make keylist[] static and remove extern definition in fsort.h, move macro SALIGN() from sort.h to fsort.c
|
| 1.6 | 19-Jan-2001 |
jdolecek | put MERGE_FNUM here, slightly clean up other defines
|
| 1.5 | 18-Jan-2001 |
jdolecek | make DEFLLEN plain 1 << 16, don't subtract magic value
|
| 1.4 | 16-Oct-2000 |
jdolecek | enlarge line buffer as necessary, so that it's possible to process lines longer than 65522 characters constify, rename MAXLLEN to DEFLLEN
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.12.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.13.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.13.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.30 | 19-Sep-2021 |
andvar | fix few more typos in comments, messages and documentation.
|
| 1.29 | 18-Oct-2013 |
christos | fix unused variable warnings
|
| 1.28 | 18-Dec-2010 |
christos | branches: 1.28.6; 1.28.12; Add an 'l' style for sorting that sorts by the string length of the field.
|
| 1.27 | 06-Jun-2010 |
wiz | Fix typo in comment.
|
| 1.26 | 05-Jun-2010 |
dholland | Rework previous change to fixit() to not trip on option arguments. (Noticed by wiz.) Clarify the loop logic involved.
|
| 1.25 | 27-May-2010 |
dholland | Don't recognize "+3" after -- or after the first non-option argument. This prevents converting "+3" into "-k4.1" in places where getopt won't recognize it, which in turn prevents silly error messages and lossage trying to sort files whose names begin with +. PR 43358.
|
| 1.24 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.23 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
| 1.22 | 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
| 1.21 | 22-Aug-2009 |
dsl | <space> and <tab> at the start of key fields are supposed to be sorted as if part of the data. This is a bit fubar since we need a value that sorts before any byte value as a key field separator - so need 257 byte values (since radixsort() doesn't take a length for each record). For now map '\t' to 0x01 and hope no one will notice!
|
| 1.20 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys: - If we generate a key, it is always sortable using memcmp() - If we are sorting the whole record, then a weight-table must be used during compares. - Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign. - Case folding (-f) is handled when the sort keys are generated. No other code has to care at all. - Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple! Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308 Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
|
| 1.19 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.18 | 28-Apr-2008 |
martin | branches: 1.18.6; 1.18.12; Remove clause 3 and 4 from TNF licenses
|
| 1.17 | 23-Oct-2006 |
jdolecek | branches: 1.17.16; fix check for field order to allow .0 form in "-k 1.2,1.0"
fix provided in PR bin/25572 by Ross Patterson
|
| 1.16 | 03-Nov-2004 |
dsl | Add (unsigned char) cast to ctype functions
|
| 1.15 | 18-Feb-2004 |
jdolecek | insertcol() may insert up to two items to clist, so allocate memory accordingly this fixes sort regression test 28A and 28B
|
| 1.14 | 17-Feb-2004 |
jdolecek | fix parsing of some +POS -POS variants, as pointed out by sort regression tests
|
| 1.13 | 17-Feb-2004 |
itojun | safer realloc idiom minor knf
|
| 1.12 | 15-Feb-2004 |
jdolecek | remove compile-time limit on number of -k options, allocate necessary structures as-needed
|
| 1.11 | 15-Feb-2004 |
jdolecek | rewrite fixit() to duplicate less code, and comment the contents better; also removes compile-time dependency on ND constant
|
| 1.10 | 15-Feb-2004 |
jdolecek | g/c redundant setfield() prototype clear setcolumn() somewhat - use strtol() instead of sscanf(), and simplify flag setting code
|
| 1.9 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.8 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.7 | 24-Dec-2002 |
jdolecek | add extern definition for ncols and clist[] to sort.h, eliminate extra definitions in init.c and field.c g/c MAXMERGE
|
| 1.6 | 31-Dec-2001 |
thorpej | Change some:
foo += sscanf(++foo, ...);
constructs to:
++foo; foo += sscanf(foo, ...);
to avoid the following warning from gcc 3.1:
warning: operation on `pos' may be undefined
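The rewrite described in this entry can be sketched in isolation. The helper name `skip_and_scan()` and the `"%d"` format are assumptions for illustration; only the two-statement shape comes from the log. In the original form the same pointer is both incremented and assigned within one expression, which is what the compiler warned about; splitting it into two statements removes the ambiguity.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Skip one leading character, then parse an integer.
 * Instead of:   pos += sscanf(++pos, "%d", value);
 * the increment and the compound assignment are separate statements.
 * Note that sscanf() returns the number of conversions (0 or 1 here),
 * not a character count, so pos advances by at most one more position.
 */
static const char *
skip_and_scan(const char *pos, int *value)
{
	assert(pos != NULL && value != NULL);
	++pos;
	pos += sscanf(pos, "%d", value);
	return pos;
}
```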
|
| 1.5 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.4 | 12-Jan-2001 |
jdolecek | use toupper() where appropriate whitespace/parenthesis police
|
| 1.3 | 16-Oct-2000 |
jdolecek | cosmetic change: make setcolumn() static, remove bogus redundant setcolumn() prototype inside setcolumn() function, constify
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.17.16.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.18.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.18.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.28.12.1 | 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
| 1.28.6.1 | 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
| 1.31 | 01-Jun-2016 |
kre | Add the posix -C option (-c but quieter). Fix -R to work properly when setting \n as the record delimiter using a numeric value rather than literal \n - and to not incorrectly turn \n into a field separator if -R is used to make some other char the record separator (\n becomes a field separator in that case as long as the field separator remains "white space" but should not be in any other case - unless set explicitly of course.)
Plus more cosmetic changes - the man page and usage are updated to make it more clear that the 2 (or 1) params to -k are not fields (field1 and field2) but specifiers of the beginning and end of one key field. There was an unused 'x' option in the GETOPTS string. The usage message is reformatted to display properly on both 80 col and > 80 col displays (on < 80 it will still probably look pretty ugly ... perhaps not quite so bad though), and is also updated to show the different usage for the -c case (and -C) from the others (only 1 file permitted) - the man page synopsis has a similar update.
Using more than one of -c -C or -m generates a usage message rather than just ignoring the -m as it did before (there was no -C before of course).
Aside from the bug fix to the interaction between -R and -t, there are no changes that affect the way anything is sorted (or read, or written).
Discussed on tech-userlevel earlier this week.
|
| 1.30 | 05-Feb-2010 |
enami | Don't touch past the end of the allocated region. It results in a segmentation violation.
|
| 1.29 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.28 | 09-Oct-2009 |
dsl | When we need to merge more than 16 files, do them in a hierarchy. Reduces the amount of data written to temporary files. The 3-level stack has to do a simple reduce after 4352 input files, for a normal file sort this is 35GB of data or about 500 million records. This needs about 50 open fd's - which should be ok. Clearly the merge sort could process more input files in one go - speeding up the sort, but at some point the number of input files would exceed whatever limit was applied.
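The figures in this entry can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the log does not spell out: that 4352 is reached as 16^2 + 16^3 with a fan-in of 16 (MERGE_FNUM elsewhere in this log), and that each input file holds one full 8MB record buffer. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MERGE_FNUM 16u	/* merge fan-in used throughout this log */

/* Integer power with an overflow check. */
static uint64_t
ipow_u64(uint64_t base, unsigned exp)
{
	uint64_t r = 1;

	while (exp-- > 0) {
		assert(base == 0 || r <= UINT64_MAX / base);
		r *= base;
	}
	return r;
}

/* Assumed reading of "4352 input files": 16^2 + 16^3. */
static uint64_t
files_before_reduce(void)
{
	return ipow_u64(MERGE_FNUM, 2) + ipow_u64(MERGE_FNUM, 3);
}

/* Total input size if every file holds bytes_per_file bytes. */
static uint64_t
total_input_bytes(uint64_t nfiles, uint64_t bytes_per_file)
{
	assert(bytes_per_file == 0 || nfiles <= UINT64_MAX / bytes_per_file);
	return nfiles * bytes_per_file;
}
```

Under those assumptions 4352 files of 8MB come to about 36.5 billion bytes, the same order as the "35GB" quoted, and 500 million records in that space works out to roughly 70 bytes per record.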
|
| 1.27 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
| 1.26 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write them to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
| 1.25 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.24 | 22-Aug-2009 |
dsl | Add some comments and clarifications to this impenetrable code. When merging ensure we accurately sort records with identical keys by file-number, otherwise a 'stable' sort won't be!
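The file-number tie-break mentioned here can be sketched as a comparator. This is an illustration under stated assumptions, not the code in msort.c: `struct merge_rec` and `merge_rec_cmp()` are hypothetical names, and keys are assumed to be byte strings comparable with memcmp(). When two keys are equal, the record from the earlier input file wins, which preserves input order across the merge.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one record at the head of a merge input. */
struct merge_rec {
	const unsigned char *key;	/* generated sort key */
	size_t keylen;			/* key length in bytes */
	int file_no;			/* index of the input file it came from */
};

/*
 * Order by key bytes, then by key length (a shorter key sorts first
 * when it is a prefix of the other), and finally by file number so
 * that records with identical keys keep their original relative order.
 */
static int
merge_rec_cmp(const struct merge_rec *a, const struct merge_rec *b)
{
	size_t n;
	int r;

	assert(a != NULL && b != NULL);
	n = a->keylen < b->keylen ? a->keylen : b->keylen;
	r = n != 0 ? memcmp(a->key, b->key, n) : 0;
	if (r != 0)
		return r;
	if (a->keylen != b->keylen)
		return a->keylen < b->keylen ? -1 : 1;
	return (a->file_no > b->file_no) - (a->file_no < b->file_no);
}
```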
|
| 1.23 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys: - If we generate a key, it is always sortable using memcmp() - If we are sorting the whole record, then a weight-table must be used during compares. - Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign. - Case folding (-f) is handled when the sort keys are generated. No other code has to care at all. - Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple! Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308 Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
|
| 1.22 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.21 | 16-Aug-2009 |
dsl | Replace all uses of sizeof(TRECHEADER) with REC_DATA_OFFSET - which is defined as offsetof(RECHEADER, data). Delete TRECHEADER.
|
| 1.20 | 15-Aug-2009 |
dsl | linebuf and linebuf_size are only used inside seq() - which also not only has its own static variable, but will also extend the buffer. Remove linebuf/size and change seq() to use a private, locally managed buffer.
|
| 1.19 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.18 | 28-Apr-2008 |
martin | branches: 1.18.6; 1.18.12; Remove clause 3 and 4 from TNF licenses
|
| 1.17 | 17-Feb-2004 |
jdolecek | branches: 1.17.32; initialize malloc()ated memory
|
| 1.16 | 18-Oct-2003 |
itojun | KNF (mostly whitespace)
|
| 1.15 | 16-Oct-2003 |
itojun | safer use of realloc
|
| 1.14 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.13 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.12 | 20-Mar-2003 |
jdolecek | get rid of one memmove() (not very significant) remove ()'s from error messages move some error checks immediately after appropriate realloc() calls
|
| 1.11 | 25-Dec-2002 |
jdolecek | make function merge() static in msort.c cosmetic change to how local variable is incremented (moved to for(;;))
|
| 1.10 | 19-Feb-2001 |
jdolecek | Pull up various cosmetic (mostly whitespace) changes from OpenBSD. This is primarily to ease syncing the two versions.
|
| 1.9 | 19-Jan-2001 |
jdolecek | merge(): use array of buffers instead of one big buffer for all records, and enlarge them as necessary to read records from merged files; the buffers are allocated once per program run, so there shouldn't be any performance difference This makes sort(1) pass also regression 40B and should make it fully arbitrary long record capable. XXX the buffer array could probably be freed on end of fmerge() to save memory
|
| 1.8 | 13-Jan-2001 |
jdolecek | when merging stuff from several files, make merge handle records correctly for stable sort so that the records are not swapped arbitrarily - this makes in-tree BSD sort(1) pass regression test 38
while here, do a couple of cleanups, like s/16/MERGE_FNUM/ where appropriate, making local stuff static and some indentation/code format changes
|
| 1.7 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.6 | 17-Oct-2000 |
jdolecek | order(): since getline()/getnext() behaviour wrt passed end pointer has changed (full buffer is used instead of first DEFLLEN bytes) the end pointer cannot be shared for crec and prec, we need to pass different value in each case
|
| 1.5 | 16-Oct-2000 |
jdolecek | constify, rename MAXLLEN to DEFLLEN
|
| 1.4 | 15-Oct-2000 |
jdolecek | don't use register declarations
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.17.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.18.12.2 | 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE
|
| 1.18.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.18.6.2 | 29-Jun-2010 |
riz | Pull up following revision(s) (requested by dholland in ticket #1420): usr.bin/sort/sort.h: revision 1.31 usr.bin/sort/sort.c: revision 1.58 usr.bin/sort/fsort.c: revision 1.47 usr.bin/sort/msort.c: revision 1.30 Don't touch past the end of allocated region. It results segmentation violation.
|
| 1.18.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.6 | 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
| 1.5 | 07-Aug-2003 |
jdolecek | branches: 1.5.32; add TNF copyright
|
| 1.4 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.5.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.4 | 19-Sep-2009 |
dsl | branches: 1.4.2; 1.4.4; Fix sort -u, PR/42094
|
| 1.3 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
| 1.2 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.1 | 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
| 1.4.4.2 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.4.4.1 | 19-Sep-2009 |
matt | file radix_sort.c was added on branch matt-nb5-mips64 on 2010-04-21 05:27:12 +0000
|
| 1.4.2.2 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.4.2.1 | 19-Sep-2009 |
sborrill | file radix_sort.c was added on branch netbsd-5 on 2009-10-14 20:41:53 +0000
|
| 1.41 | 17-Feb-2025 |
wiz | capitalize POSIX
|
| 1.40 | 01-Sep-2019 |
sevan | branches: 1.40.10; sort was there since v1 https://www.bell-labs.com/usr/dmr/www/man61.pdf
|
| 1.39 | 11-Jul-2019 |
msaitoh | branches: 1.39.2; Fix typo (s/supress/suppress/).
|
| 1.38 | 03-Jul-2017 |
wiz | branches: 1.38.6; Remove workaround for ancient HTML generation code.
|
| 1.37 | 21-Dec-2016 |
abhinav | Add missing full stop.
|
| 1.36 | 01-Jun-2016 |
wiz | branches: 1.36.2; Sort options and their descriptions. Sync usage more with man page. Bump date in man page for new option -C.
|
| 1.35 | 01-Jun-2016 |
kre | Add the posix -C option (-c but quieter). Fix -R to work properly when setting \n as the record delimiter using a numeric value rather than literal \n - and to not incorrectly turn \n into a field separator if -R is used to make some other char the record separator (\n becomes a field separator in that case as long as the field separator remains "white space" but should not be in any other case - unless set explicitly of course.)
Plus more cosmetic changes - the man page and usage are updated to make it more clear that the 2 (or 1) params to -k are not fields (field1 and field2) but specifiers of the beginning and end of one key field. There was an unused 'x' option in the GETOPTS string. The usage message is reformatted to display properly on both 80 col and > 80 col displays (on < 80 it will still probably look pretty ugly ... perhaps not quite so bad though), and is also updated to show the different usage for the -c case (and -C) from the others (only 1 file permitted) - the man page synopsis has a similar update.
Using more than one of -c -C or -m generates a usage message rather than just ignoring the -m as it did before (there was no -C before of course).
Aside from the bug fix to the interaction between -R and -t, there are no changes that affect the way anything is sorted (or read, or written).
Discussed on tech-userlevel earlier this week.
|
| 1.34 | 29-May-2013 |
wiz | - Remove redundant argument to non-first `.Nm' macro; - reference `-u' at `-c', to make more clear that the former can be used with the latter; - bump date.
From Bug Hunting.
While here, use Aq.
|
| 1.33 | 20-Jan-2013 |
apb | As from today, numeric fields may begin with an optional plus or minus sign, not only an optional minus sign.
|
| 1.32 | 18-Dec-2010 |
wiz | branches: 1.32.6; 1.32.12; Sort sections.
|
| 1.31 | 18-Dec-2010 |
christos | Add an 'l' style for sorting that sorts by the string length of the field.
|
| 1.30 | 14-May-2010 |
jruoho | RETURN VALUES -> EXIT STATUS.
|
| 1.29 | 23-Aug-2009 |
wiz | Fix pasto.
|
| 1.28 | 22-Aug-2009 |
dsl | Bring nearer to reality. Note that -H is now ignored. Move -S and -s (and -H) to the first list of options since they are global ones, not ones that override the ordering rules.
|
| 1.27 | 11-Mar-2009 |
joerg | Don't workaround ancient macro argument limit with .Xo/.Xc.
|
| 1.26 | 02-May-2008 |
martin | branches: 1.26.6; 1.26.8; 1.26.12; Move TNF licenses to 2 clause form
|
| 1.25 | 23-Jul-2004 |
wiz | branches: 1.25.26; Sort options in SYNOPSIS. From Kouichirou Hiratsuka in PR 26278.
|
| 1.24 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.23 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.22 | 27-Jun-2003 |
wiz | Pa Ar -> Ar.
|
| 1.21 | 25-Feb-2003 |
wiz | .Nm does not need a dummy argument ("") before punctuation or for correct formatting of the SYNOPSIS any longer.
|
| 1.20 | 04-Feb-2003 |
perry | "Utilize" has exactly the same meaning as "use," but it is more difficult to read and understand. Most manuals of English style therefore say that you should use "use".
|
| 1.19 | 06-Jan-2003 |
wiz | compatibility, not compatiblity.
|
| 1.18 | 08-Feb-2002 |
ross | Generate <>& symbolically. I'm avoiding .../dist/... directories for now.
|
| 1.17 | 08-Dec-2001 |
wiz | Punctuation nits, sort SEE ALSO.
|
| 1.16 | 16-Mar-2001 |
fair | Add cross references for qsort(3), and radixsort(3), per PR 10567
|
| 1.15 | 19-Feb-2001 |
jdolecek | Pull in various cosmetic changes from OpenBSD version of this manpage - mostly whitespace changes, which don't influence the layout of result manpage at all, but also add -H to SYNOPSIS and state sort(1) appeared in v5, not v6 of AT&T Unix.
|
| 1.14 | 19-Feb-2001 |
jdolecek | document -T and TMPDIR handling; resurrect ENVIRONMENT and FILES, adjust to be more correct; slightly adjust SYNOPSIS line, so that it looks a little nicer :)
|
| 1.13 | 07-Feb-2001 |
jdolecek | move sections so that the order is more like the one specified by mdoc.samples(7)
|
| 1.12 | 07-Feb-2001 |
jdolecek | use -R instead of -w, to be compatible with OpenBSD
|
| 1.11 | 07-Feb-2001 |
jdolecek | s/-T/-w/
|
| 1.10 | 16-Jan-2001 |
jdolecek | set date to when this utility became default system sort(1) on NetBSD add information about when it came to NetBSD to HISTORY
|
| 1.9 | 13-Jan-2001 |
jdolecek | note this sort(1) implementation appeared in 4.4BSD
|
| 1.8 | 13-Jan-2001 |
jdolecek | add -s/-S to synopsis remove TMPDIR stuff - it no longer applies, at least for now move the note about link/unlink from BUGS to NOTES add note about trailing record separator and lack of restriction on line length or allowed bytes
|
| 1.7 | 08-Jan-2001 |
jdolecek | by default, use stable sort add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
| 1.6 | 07-Nov-2000 |
lukem | fix up various .Nm abuses: - keep the case consistent between the actual name and what's referenced. e.g, if it's `foo', don't use '.Nm Foo' at the start of a sentence. - remove unnecessary `.Nm foo' after the first occurrence (except for using `.Nm ""' if there's stuff following, or for the 2nd and so on occurrences in a SYNOPSIS - use Sx, Ic, Li, Em, Sq, and Xr as appropriate
|
| 1.5 | 16-Oct-2000 |
jdolecek | enlarge line buffer as necessary, so that it's possible to process lines longer than 65522 characters constify, rename MAXLLEN to DEFLLEN
|
| 1.4 | 14-Oct-2000 |
bjh21 | HEAVY formatting cleanup.
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.25.26.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.26.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.26.8.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.26.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.32.12.2 | 23-Jun-2013 |
tls | resync from head
|
| 1.32.12.1 | 25-Feb-2013 |
tls | resync with head
|
| 1.32.6.2 | 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
| 1.32.6.1 | 23-Jan-2013 |
yamt | sync with head
|
| 1.36.2.1 | 07-Jan-2017 |
pgoyette | Sync with HEAD. (Note that most of these changes are simply $NetBSD$ tag issues.)
|
| 1.38.6.1 | 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
| 1.39.2.1 | 05-Sep-2019 |
martin | Pull up following revision(s) (requested by sevan in ticket #174): lib/libc/sys/chmod.2: revision 1.48 lib/libc/sys/stat.2: revision 1.59 lib/libc/sys/unlink.2: revision 1.30 lib/libc/sys/lseek.2: revision 1.25 lib/libc/sys/getuid.2: revision 1.18 lib/libc/sys/chown.2: revision 1.37 lib/libm/man/exp.3: revision 1.32 lib/libm/man/log.3: revision 1.7 lib/libc/sys/open.2: revision 1.60 lib/libc/stdio/fopen.3: revision 1.36 lib/libc/stdio/putc.3: revision 1.14 lib/libc/sys/mount.2: revision 1.51 share/man/man9/copy.9: revision 1.22 share/man/man9/uiomove.9: revision 1.20 lib/libc/sys/setuid.2: revision 1.23 lib/libc/sys/close.2: revision 1.18 sbin/init/init.8: revision 1.61 lib/libc/sys/write.2: revision 1.36 lib/libc/sys/read.2: revision 1.39 sbin/init/init.8: revision 1.62 lib/libc/sys/wait.2: revision 1.40 usr.bin/tty/tty.1: revision 1.10 lib/libc/sys/link.2: revision 1.33 usr.bin/du/du.1: revision 1.24 lib/libc/stdlib/exit.3: revision 1.17 usr.bin/su/su.1: revision 1.53 usr.bin/mail/mail.1: revision 1.66 lib/libc/sys/fork.2: revision 1.25 usr.bin/su/su.1: revision 1.54 usr.bin/mail/mail.1: revision 1.67 lib/libm/man/sin.3: revision 1.15 share/man/man9/intro.9: revision 1.26 share/man/man5/utmp.5: revision 1.17 lib/libc/compat-43/creat.3: revision 1.17 lib/libc/time/ctime.3: revision 1.61 lib/libcompat/4.1/stty.3: revision 1.10 usr.bin/dc/dc.1: revision 1.3 lib/libm/man/cos.3: revision 1.17 lib/libc/sys/chdir.2: revision 1.23 lib/libc/gen/exec.3: revision 1.30 lib/libc/gen/exec.3: revision 1.31 games/bcd/bcd.6: revision 1.18 games/bcd/bcd.6: revision 1.19 usr.bin/write/write.1: revision 1.7 usr.bin/wc/wc.1: revision 1.18 usr.bin/pr/pr.1: revision 1.24 usr.bin/who/who.1: revision 1.25 lib/libc/sys/mkdir.2: revision 1.30 lib/libc/stdio/getc.3: revision 1.13 usr.bin/sort/sort.1: revision 1.40 usr.bin/mesg/mesg.1: revision 1.11 share/man/man5/passwd.5: revision 1.34 sort was there since v1 https://www.bell-labs.com/usr/dmr/www/man61.pdf
dc was in v1 https://www.bell-labs.com/usr/dmr/www/man12.pdf
du was in v1 https://www.bell-labs.com/usr/dmr/www/man12.pdf
mail was in v1 https://www.bell-labs.com/usr/dmr/www/man12.pdf
mesg was in v1 https://www.bell-labs.com/usr/dmr/www/man12.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man13.pdf
su was in v1 https://www.bell-labs.com/usr/dmr/www/man13.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man13.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man14.pdf Update URL
write was in v1 https://www.bell-labs.com/usr/dmr/www/man14.pdf grammar
passwd(5) was in v1 https://www.bell-labs.com/usr/dmr/www/man51.pdf
utmp(5) was present in v1 https://www.bell-labs.com/usr/dmr/www/man51.pdf
Earliest version of wtmp I could find was in v3 https://minnie.tuhs.org/cgi-bin/utree.pl?file=V3/man/man5/wtmp.5
Document history of chdir(2) https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history of chmod(2) https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history of chown(2) https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man21.pdf
create was present in v1 https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history of exec() Move statement on execlpe() & execvpe() to HISTORY section.
Document history https://www.bell-labs.com/usr/dmr/www/man21.pdf
fork was present in v1 https://www.bell-labs.com/usr/dmr/www/man21.pdf stat() was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
document history of fstat() https://www.bell-labs.com/usr/dmr/www/man21.pdf
getuid was present in v1 https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man21.pdf
Document history https://www.bell-labs.com/usr/dmr/www/man21.pdf
stty & gtty were around since v1 https://www.bell-labs.com/usr/dmr/www/man21.pdf https://www.bell-labs.com/usr/dmr/www/man22.pdf
mount & umount were present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
Open was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
read was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
seek was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
setuid was in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
unlink was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
wait was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
write was present in v1 https://www.bell-labs.com/usr/dmr/www/man22.pdf
start documenting history exp was present in v1 https://www.bell-labs.com/usr/dmr/www/man31.pdf
Start documenting history https://www.bell-labs.com/usr/dmr/www/man31.pdf
Start documenting history https://www.bell-labs.com/usr/dmr/www/man31.pdf
log appeared in v1 https://www.bell-labs.com/usr/dmr/www/man31.pdf
putc & putw were in v1 https://www.bell-labs.com/usr/dmr/www/man31.pdf
putchar was in v4 https://minie.tuhs.org/cgi-bin/utree.pl?file=V4/man/man3/putchr.3
Start documenting history https://www.bell-labs.com/usr/dmr/www/man31.pdf
Document history. https://www.bell-labs.com/usr/dmr/www/man11.pdf Between v1 & v6 UNIX, bcd was rewritten in C, but I don't know in which version, hence I've skipped mentioning it. End sentence with a dot. Remove superfluous Pp. Remove superfluous Pp. Remove superfluous Ns. Remove superfluous Pp. fetch(9) -> ufetch(9) fetch(9) -> ufetch(9). Remove superfluous Pp. fetch(9) -> ufetch(9). Remove reference to unimplemented ppi(9).
|
| 1.40.10.1 | 02-Aug-2025 |
perseant | Sync with HEAD
|
| 1.64 | 10-Jan-2017 |
christos | refactor includes, add <sys/stat.h>
|
| 1.63 | 01-Jun-2016 |
wiz | branches: 1.63.2; Sort options and their descriptions. Sync usage more with man page. Bump date in man page for new option -C.
|
| 1.62 | 01-Jun-2016 |
kre | Add the posix -C option (-c but quieter). Fix -R to work properly when setting \n as the record delimiter using a numeric value rather than literal \n - and to not incorrectly turn \n into a field separator if -R is used to make some other char the record separator (\n becomes a field separator in that case as long as the field separator remains "white space" but should not be in any other case - unless set explicitly of course.)
Plus more cosmetic changes - the man page and usage are updated to make it more clear that the 2 (or 1) params to -k are not fields (field1 and field2) but specifiers of the beginning and end of one key field. There was an unused 'x' option in the GETOPTS string. The usage message is reformatted to display properly on both 80 col and > 80 col displays (on < 80 it will still probably look pretty ugly ... perhaps not quite so bad though), and is also updated to show the different usage for the -c case (and -C) from the others (only 1 file permitted) - the man page synopsis has a similar update.
Using more than one of -c -C or -m generates a usage message rather than just ignoring the -m as it did before (there was no -C before of course).
Aside from the bug fix to the interaction between -R and -t, there are no changes that affect the way anything is sorted (or read, or written).
Discussed on tech-userlevel earlier this week.
|
| 1.61 | 16-Sep-2011 |
joerg | Use __dead
|
| 1.60 | 18-Dec-2010 |
christos | Add an 'l' style for sorting that sorts by the string length of the field.
|
| 1.59 | 05-Jun-2010 |
dholland | fixit() needs to know the getopt options list to do its thing correctly.
|
| 1.58 | 05-Feb-2010 |
enami | Don't touch past the end of the allocated region. It results in a segmentation violation.
|
| 1.57 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.56 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
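The realloc() point in revision 1.56 can be illustrated with a minimal sketch. It is not the code from sort(1): `struct growbuf` and `growbuf_append()` are hypothetical names, and the idea shown is simply to keep positions as offsets from the base so that nothing depends on where realloc() places the buffer or its end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: positions are stored as offsets, never as
 * pointers, so they stay valid when realloc() moves the allocation. */
struct growbuf {
	unsigned char *base;	/* start of the allocation */
	size_t used;		/* offset of the first free byte */
	size_t size;		/* total bytes allocated */
};

/* Append len bytes, growing the buffer if needed. Returns 0 or -1. */
static int
growbuf_append(struct growbuf *gb, const void *data, size_t len)
{
	if (len == 0)
		return 0;
	if (len > gb->size - gb->used) {
		size_t nsize = gb->size ? gb->size : 64;
		unsigned char *nbase;

		while (nsize - gb->used < len)
			nsize *= 2;
		nbase = realloc(gb->base, nsize);
		if (nbase == NULL)
			return -1;	/* gb->base is still valid */
		gb->base = nbase;	/* may differ from the old address */
		gb->size = nsize;
	}
	assert(gb->used + len <= gb->size);
	memcpy(gb->base + gb->used, data, len);
	gb->used += len;
	return 0;
}
```

Because the end of the data is always recomputed as `base + used`, no stale end pointer survives a move.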
|
| 1.55 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
| 1.54 | 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
| 1.53 | 22-Aug-2009 |
dsl | Put radixsort() and sradixsort() the correct way around.
|
| 1.52 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys: - If we generate a key, it is always sortable using memcmp() - If we are sorting the whole record, then a weight-table must be used during compares. - Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign. - Case folding (-f) is handled when the sort keys are generated. No other code has to care at all. - Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple! Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308 Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
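The idea that a generated key is "always sortable using memcmp()" can be sketched for fixed-width integers. This is an illustration only and not the number encoding used by sort(1), which handles arbitrary decimal text: `encode_key()` and `KEYLEN` are hypothetical, and the reverse order here is produced by complementing the bits rather than by inverting the sign as the commit describes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEYLEN 4

/* Encode a 32-bit signed integer so memcmp() order equals numeric order. */
static void
encode_key(int32_t value, unsigned char key[KEYLEN], int reverse)
{
	/* Flip the sign bit: negatives map below non-negatives as unsigned. */
	uint32_t u = (uint32_t)value ^ UINT32_C(0x80000000);

	assert(key != NULL);
	if (reverse)
		u = ~u;	/* complement the bits for a descending sort */
	/* Store the most significant byte first (big-endian). */
	key[0] = (unsigned char)(u >> 24);
	key[1] = (unsigned char)(u >> 16);
	key[2] = (unsigned char)(u >> 8);
	key[3] = (unsigned char)u;
}
```

Once every record carries such a key, the comparison code no longer needs to know whether the field was numeric or reversed.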
|
| 1.51 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.50 | 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. Eg PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort for fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
|
| 1.49 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.48 | 13-Apr-2009 |
lukem | Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
|
| 1.47 | 08-Nov-2008 |
christos | branches: 1.47.2; Make -R accept numeric arguments so one can say -R '\0' to be used in pipelines like find . -print0 | sort -R '\0'. From Anon Ymous
|
| 1.46 | 21-Jul-2008 |
lukem | branches: 1.46.4; 1.46.8; Remove the \n and tabs from the __COPYRIGHT() strings. Tweak to use a consistent format.
|
| 1.45 | 28-Apr-2008 |
martin | branches: 1.45.2; Remove clause 3 and 4 from TNF licenses
|
| 1.44 | 23-Oct-2006 |
jdolecek | branches: 1.44.16; when using -o into file which already exists, copy the permissions of the original file to the new (sorted) file
addresses PR bin/26860 by Michael van Elst
|
| 1.43 | 23-Oct-2006 |
jdolecek | replace access(2) + /dev/ prefix check with lstat(2) and S_ISCHR()/S_ISBLK()
part of PR bin/26860 by Michael van Elst
while here, put output file fopen() inside the code block of the only code path where it's actually needed, to make the logic more obvious; and in the "stdout" case, initialize toutpath to empty string rather than /dev/stdout, to make it clear /dev/stdout is not actually used
|
| 1.42 | 23-Oct-2006 |
jdolecek | use F_OK instead of 0 for second parameter of access(2)
part of PR bin/26860 by Michael van Elst
|
| 1.41 | 23-Jul-2004 |
wiz | Sync usage with man page. From Kouichirou Hiratsuka in PR 26278.
|
| 1.40 | 14-Mar-2004 |
heas | remove double initialisation of SINGL_FLD & SEP_FLAG
|
| 1.39 | 17-Feb-2004 |
jdolecek | ftpos pointer was not updated when fldtab was reallocated; drop completely in favour of an index counter; fixes bin/24449 by Jun-ichiro itojun
|
| 1.38 | 17-Feb-2004 |
jdolecek | fldtab[] needs to have one extra element - this marks the end of the array; addresses part of PR bin/24449 by Jun-ichiro itojun
|
| 1.37 | 17-Feb-2004 |
itojun | use safer realloc idiom memset new region got by realloc
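The "safer realloc idiom" together with clearing the newly obtained region can be sketched as below. `struct entry` and `grow_table()` are hypothetical stand-ins, not the fldtab code itself; the point is that the result of realloc() goes into a temporary so the old block is not lost on failure, and that only the added tail is zeroed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element type standing in for a field-table entry. */
struct entry {
	int start;
	int end;
};

/* Grow *tabp from oldn to newn elements and zero the added ones.
 * Returns 0 on success; on failure returns -1 and leaves *tabp intact. */
static int
grow_table(struct entry **tabp, size_t oldn, size_t newn)
{
	struct entry *ntab;

	assert(newn > 0 && newn >= oldn);
	if (newn > SIZE_MAX / sizeof(*ntab))
		return -1;	/* the size multiplication would overflow */
	ntab = realloc(*tabp, newn * sizeof(*ntab));
	if (ntab == NULL)
		return -1;	/* old block not leaked: *tabp still owns it */
	memset(ntab + oldn, 0, (newn - oldn) * sizeof(*ntab));
	*tabp = ntab;
	return 0;
}
```

Writing `tab = realloc(tab, n)` directly would overwrite the only pointer to the old block with NULL on failure, which is the leak the idiom avoids.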
|
| 1.36 | 17-Feb-2004 |
itojun | initialize fldtab
|
| 1.35 | 15-Feb-2004 |
jdolecek | remove compile-time limit on number of -k options, allocate necessary structures as-needed
|
| 1.34 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.33 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.32 | 24-Dec-2002 |
jdolecek | g/c many_files(), too
|
| 1.31 | 24-Dec-2002 |
jdolecek | bump 'soft' limit for number of files to hard limit on startup; we want to be able to open as many temporary files as possible
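Raising the soft open-file limit to the hard limit is typically done with getrlimit(2) and setrlimit(2). The sketch below assumes RLIMIT_NOFILE is the limit meant and uses a hypothetical helper name, `raise_nofile_limit()`; some systems reject a soft limit equal to an unlimited hard limit, which this sketch simply reports as failure.

```c
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_NOFILE limit to the hard limit.
 * Returns 0 on success, -1 if getrlimit(2) or setrlimit(2) fails. */
static int
raise_nofile_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
		return -1;
	if (rl.rlim_cur == rl.rlim_max)
		return 0;		/* already at the hard limit */
	rl.rlim_cur = rl.rlim_max;	/* unprivileged processes may do this */
	if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
		return -1;
	return 0;
}
```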
|
| 1.30 | 24-Dec-2002 |
jdolecek | move fldtab outside main and make it static, eliminate two memset()s; g/c superfluous extern definition for clist[] and ncols; make toutpath[] static
|
| 1.29 | 27-Nov-2002 |
tron | Remove the statically initialized "sigaction" structure completely because such usage is broken. Problem pointed out by Klaus Klein on "sources-changes@netbsd.org".
|
| 1.28 | 27-Nov-2002 |
tron | Add braces in a statically initialized "sigaction" structure to fix a build problem after siginfo(2) has been added.
|
| 1.27 | 14-May-2001 |
jdolecek | disable the code which maxes nofiles limit, it should not be normally needed now
|
| 1.26 | 30-Apr-2001 |
ross | XXX For some reason this program wants to open _hundreds_ of temporary files. Make it setrlimit(RLIMIT_NOFILE, ...), so this rather dubious strategy at least works well enough to ctag(1) our own kernel. XXX
|
| 1.25 | 22-Feb-2001 |
christos | - use MAXPATHLEN (1024) instead of _POSIX_PATH_MAX (255) for the temporary path buffer - provide better error messages about why the temp file creation is failing - explicitly compare syscall return to -1 instead of < 0 and fdopen return to NULL instead of 0.
|
| 1.24 | 21-Feb-2001 |
christos | Fix problem when using sort >> foo If no output file was specified sort fopened("/dev/stdout", "w"). This is *wrong* because "/dev/stdout" will truncate the output file, thus undoing the append effect the shell had set up. The simple fix here is to just arrange for outfp = stdout and don't play with /dev/stdout.
While I am here: - KNF - make pattern for mkstemp have 6 X's.
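The `sort >> foo` fix in revision 1.24 comes down to reusing the inherited stdout stream instead of reopening it by name. A minimal sketch, with a hypothetical helper `open_output()` that is not the function used in sort.c:

```c
#include <assert.h>
#include <stdio.h>

/* Pick the output stream: the inherited stdout when no -o file is given,
 * otherwise a freshly opened (and truncated) file. Returns NULL on error. */
static FILE *
open_output(const char *outpath)
{
	if (outpath == NULL)
		return stdout;	/* keeps whatever mode the shell set up */
	return fopen(outpath, "w");
}
```

Opening "/dev/stdout" with mode "w" requests truncation of whatever file the shell attached to descriptor 1, which is how the append redirection was being undone.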
|
| 1.23 | 19-Feb-2001 |
jdolecek | full -T support
|
| 1.22 | 19-Feb-2001 |
jdolecek | resurrect old ftmp() - it supports alternative directory for temporary file, which is needed for -T support
|
| 1.21 | 07-Feb-2001 |
jdolecek | use -R instead of -w, since that's what OpenBSD is using and there is no reason to be different
|
| 1.20 | 07-Feb-2001 |
jdolecek | Since -T is used to select directory for temporary files in other sort implementations, we should avoid using it for something else. Use (new) flag -w for setting record delimiter, make -T noop.
|
| 1.19 | 07-Feb-2001 |
jdolecek | use errx(), not err() within section for '-t' flag
|
| 1.18 | 13-Jan-2001 |
soren | And make usage() test for NULL explicitly..
|
| 1.17 | 13-Jan-2001 |
soren | usage() expects a NULL when there is no specific error message.
|
| 1.16 | 13-Jan-2001 |
jdolecek | save couple of cycles and bytes by static initialization of sigaction act and sigtable[]
|
| 1.15 | 12-Jan-2001 |
jdolecek | alltable[], itable[], dtable[] were moved to init.c, g/c from sort.[ch]; put extern declaration for gweights[] to sort.h; add -s/-S to usage(), couple of formatting nits
|
| 1.14 | 11-Jan-2001 |
jdolecek | the g/c in rev 1.12 was too aggressive - put back code to change file '-' to '/dev/stdin'
|
| 1.13 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing: * get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist * add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.12 | 08-Jan-2001 |
jdolecek | make ftmp() a wrapper around tmpfile(), there is no need to reimplement it; move ftmp() from tmp.c to files.c; g/c no longer needed stuff
|
| 1.11 | 08-Jan-2001 |
jdolecek | call setlocale() on startup reformat the switch contents in main() a little, sort flags by alphabet where possible
|
| 1.10 | 08-Jan-2001 |
jdolecek | constify a bit, small cleanups
|
| 1.9 | 08-Jan-2001 |
jdolecek | by default, use stable sort add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
| 1.8 | 16-Oct-2000 |
jdolecek | include a bit more information in error messages, constify put temporary files in _PATH_TMP by default
|
| 1.7 | 11-Oct-2000 |
thorpej | Format string fixes.
|
| 1.6 | 07-Oct-2000 |
bjh21 | OpenBSD revision 1.5: Normalize treatment of -n option. Don't know why it was ever special-cased (since it was broken that way).
|
| 1.5 | 07-Oct-2000 |
bjh21 | OpenBSD revision 1.3: for implied stdin, do not corrupt argv[0]
|
| 1.4 | 07-Oct-2000 |
bjh21 | Part of OpenBSD revision 1.2: Fix err(3) usage.
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.44.16.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.45.2.1 | 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
| 1.46.8.2 | 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
|
| 1.46.8.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.46.4.2 | 29-Jun-2010 |
riz | Pull up following revision(s) (requested by dholland in ticket #1420): usr.bin/sort/sort.h: revision 1.31 usr.bin/sort/sort.c: revision 1.58 usr.bin/sort/fsort.c: revision 1.47 usr.bin/sort/msort.c: revision 1.30 Don't touch past the end of the allocated region. It results in a segmentation violation.
|
| 1.46.4.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.47.2.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.63.2.1 | 20-Mar-2017 |
pgoyette | Sync with HEAD
|
| 1.36 | 01-Jun-2016 |
kre | Add the posix -C option (-c but quieter). Fix -R to work properly when setting \n as the record delimiter using a numeric value rather than a literal \n - and to not incorrectly turn \n into a field separator if -R is used to make some other char the record separator (\n becomes a field separator in that case as long as the field separator remains "white space" but should not be in any other case - unless set explicitly of course.)
Plus more cosmetic changes - the man page and usage are updated to make it more clear that the 2 (or 1) params to -k are not fields (field1 and field2) but specifiers of the beginning and end of one key field. There was an unused 'x' option in the GETOPTS string. The usage message is reformatted to display properly on both 80 col and > 80 col displays (on < 80 it will still probably look pretty ugly ... perhaps not quite so bad though), and is also updated to show the different usage for the -c case (and -C) from the others (only 1 file permitted) - the man page synopsis has a similar update.
Using more than one of -c -C or -m generates a usage message rather than just ignoring the -m as it did before (there was no -C before of course).
Aside from the bug fix to the interaction between -R and -t, there are no changes that affect the way anything is sorted (or read, or written).
Discussed on tech-userlevel earlier this week.
|
| 1.35 | 05-Aug-2015 |
mrg | add a description of what was being attempted to the failed-write messages.
|
| 1.34 | 16-Sep-2011 |
joerg | Use __dead
|
| 1.33 | 18-Dec-2010 |
christos | Add an 'l' style for sorting that sorts by the string length of the field.
|
| 1.32 | 05-Jun-2010 |
dholland | fixit() needs to know the getopt options list to do its thing correctly.
|
| 1.31 | 05-Feb-2010 |
enami | Don't touch past the end of allocated region. It results in a segmentation violation.
|
| 1.30 | 28-Sep-2009 |
dsl | Fix borked fix for sort relying on realloc() changing the buffer end. Sorts of more than 8MB data now probably work again.
|
| 1.29 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
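The realloc() hazard named in revisions 1.29 and 1.30 can be illustrated with a minimal sketch. It is not the code from sort(1): `struct growbuf` and `growbuf_resize()` are hypothetical names chosen for the example. The idea shown is only that offsets are taken before realloc() and every interior pointer, including the end pointer, is recomputed from the new base afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable buffer; the names are illustrative, not from sort(1). */
struct growbuf {
	char *base;	/* start of the allocation */
	char *cur;	/* current position inside the allocation */
	char *end;	/* one past the last usable byte */
};

/*
 * Resize the buffer to newsize bytes.
 * realloc() may move the block, so no old pointer (start, current, or end)
 * can be assumed valid afterwards. The offset is saved first and all three
 * pointers are rebuilt from the new base.
 * Returns 0 on success, -1 on failure (the buffer is left unchanged).
 */
static int
growbuf_resize(struct growbuf *gb, size_t newsize)
{
	size_t used = (size_t)(gb->cur - gb->base);
	char *nbase;

	assert(newsize > 0 && newsize >= used);
	nbase = realloc(gb->base, newsize);
	if (nbase == NULL)
		return -1;
	gb->base = nbase;
	gb->cur = nbase + used;		/* rebase the interior pointer */
	gb->end = nbase + newsize;	/* the end address changes too */
	return 0;
}
```

A caller would initialise `base` with malloc(), set `cur` and `end` relative to it, and call `growbuf_resize()` whenever `cur` approaches `end`.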
| 1.28 | 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
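One way to picture the comparison rules described in revision 1.28 is the sketch below. It is an assumption-laden illustration, not the sort(1) source: `struct rec`, `bytes_cmp()` and `rec_cmp()` are hypothetical. It shows keys compared as unweighted bytes with an explicit length, so that a short key sorts before a longer key sharing its prefix, and an optional tie-break on the raw record data of the kind the message calls a posix sort.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout; field names are illustrative only. */
struct rec {
	const unsigned char *key;	/* generated sort key */
	size_t keylen;			/* stored key length */
	const unsigned char *data;	/* raw record bytes */
	size_t datalen;
};

/* Unweighted byte comparison; on a common prefix the shorter one sorts first. */
static int
bytes_cmp(const unsigned char *a, size_t alen,
    const unsigned char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int r = memcmp(a, b, n);

	if (r != 0)
		return r;
	return (alen > blen) - (alen < blen);
}

/*
 * Compare two records by key. With posix != 0, records whose keys are
 * equal are ordered by their raw data; otherwise they compare equal and
 * a stable sort keeps them in input order.
 */
static int
rec_cmp(const struct rec *a, const struct rec *b, int posix)
{
	int r;

	assert(a != NULL && b != NULL);
	r = bytes_cmp(a->key, a->keylen, b->key, b->keylen);
	if (r != 0 || !posix)
		return r;
	return bytes_cmp(a->data, a->datalen, b->data, b->datalen);
}
```

Because the length is stored alongside the key, no byte value has to be reserved as a terminator inside the key itself.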
| 1.27 | 05-Sep-2009 |
dsl | Now we have our own radix_sort() change the interface so that we pass an array of 'RECHEADER *' and remove all the crappy stuff that backed up by REC_DATA_OFFSET (etc). Also change radix_sort() to return the number of elements, soon to be used to drop duplicate keys (for sort -u).
|
| 1.26 | 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
| 1.25 | 22-Aug-2009 |
dsl | Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
|
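The property "a generated key is always sortable using memcmp()" can be shown with a deliberately simplified sketch. The real sort(1) encodes arbitrary decimal strings; the version below is a swapped-in technique that handles only 32-bit integers, and `encode_key()`, `key_cmp()` and `KEYLEN` are hypothetical names. The sign bit is flipped and the bytes are stored most significant first, so byte order equals numeric order. For reverse order the sketch complements the biased value rather than negating the number, which avoids overflow at INT32_MIN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEYLEN 4	/* illustrative fixed key size for a 32-bit value */

/*
 * Encode v so that memcmp() on the resulting bytes matches numeric order
 * (or the reverse of it when reverse != 0).
 */
static void
encode_key(int32_t v, int reverse, unsigned char key[KEYLEN])
{
	uint32_t u;

	assert(key != NULL);
	u = (uint32_t)v ^ UINT32_C(0x80000000);	/* negatives sort below positives */
	if (reverse)
		u = ~u;				/* invert the ordering */
	key[0] = (unsigned char)(u >> 24);	/* most significant byte first */
	key[1] = (unsigned char)(u >> 16);
	key[2] = (unsigned char)(u >> 8);
	key[3] = (unsigned char)u;
}

/* Keys built by encode_key() need no type-specific comparison. */
static int
key_cmp(const unsigned char a[KEYLEN], const unsigned char b[KEYLEN])
{
	return memcmp(a, b, KEYLEN);
}
```

With keys of this kind the sort and merge code can stay ignorant of whether a field was numeric, folded, or reversed.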
| 1.24 | 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
| 1.23 | 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. Eg PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort of fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
|
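The approach adopted in revision 1.23, merging runs that are each already fully sorted, can be sketched as follows. This is only an illustration under simplifying assumptions: the runs are in-memory int arrays rather than temporary files, and `struct run` and `merge_runs()` are hypothetical names. Each step emits the smallest head element among the runs; ties go to the lowest-numbered run, which keeps the merge stable.

```c
#include <assert.h>
#include <stddef.h>

/* One sorted run; stands in for one temporary file of sorted records. */
struct run {
	const int *items;	/* sorted ascending */
	size_t len;		/* number of items */
	size_t pos;		/* next unread item */
};

/*
 * Merge nruns sorted runs into out[], which must hold at least the total
 * number of items. Returns the number of items written.
 */
static size_t
merge_runs(struct run *runs, size_t nruns, int *out, size_t outcap)
{
	size_t n = 0;

	for (;;) {
		size_t best = nruns;	/* nruns means "none found yet" */
		size_t i;

		for (i = 0; i < nruns; i++) {
			if (runs[i].pos >= runs[i].len)
				continue;	/* this run is exhausted */
			if (best == nruns ||
			    runs[i].items[runs[i].pos] <
			    runs[best].items[runs[best].pos])
				best = i;
		}
		if (best == nruns)
			break;			/* all runs exhausted */
		assert(n < outcap);
		out[n++] = runs[best].items[runs[best].pos++];
	}
	return n;
}
```

The linear scan over run heads is fine for a small number of runs; a heap would be the usual choice when many runs are merged at once.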
| 1.22 | 16-Aug-2009 |
dsl | Replace all uses of sizeof(TRECHEADER) with REC_DATA_OFFSET - which is defined as offsetof(RECHEADER, data). Delete TRECHEADER.
|
| 1.21 | 15-Aug-2009 |
dsl | Remove reference to db.h by using separate ptr+len fields for the only structure that used it. Pass end of keybuf area, not size to enterkey() - largely to remove a variable whose use isn't obvious from the name! The structure of this code sucks.
|
| 1.20 | 13-Apr-2009 |
lukem | Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
|
| 1.19 | 28-Apr-2008 |
martin | branches: 1.19.6; 1.19.8; 1.19.12; Remove clause 3 and 4 from TNF licenses
|
| 1.18 | 15-Feb-2004 |
jdolecek | branches: 1.18.32; remove compile-time limit on number of -k options, allocate necessary structures as-needed
|
| 1.17 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.16 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.15 | 25-Dec-2002 |
jdolecek | make function merge() static in msort.c; cosmetic change to how local variable is incremented (moved to for(;;))
|
| 1.14 | 24-Dec-2002 |
jdolecek | put contents of extern.h directly to sort.h, and g/c extern.h; de-__P()
|
| 1.13 | 24-Dec-2002 |
jdolecek | add extern definition for ncols and clist[] to sort.h, eliminate extra definitions in init.c and field.c; g/c MAXMERGE
|
| 1.12 | 19-Feb-2001 |
jdolecek | cosmetic changes - make keylist[] static and remove extern definition in fsort.h, move macro SALIGN() from sort.h to fsort.c
|
| 1.11 | 19-Jan-2001 |
jdolecek | adjust indentation
|
| 1.10 | 16-Jan-2001 |
shin | - fix alignment problem.
|
| 1.9 | 12-Jan-2001 |
jdolecek | alltable[], itable[], dtable[] were moved to init.c, g/c from sort.[ch]; put extern declaration for gweights[] to sort.h
|
| 1.8 | 11-Jan-2001 |
jdolecek | general cleanup of file list passing:
* get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist
* add new typedefs gen_func_t and put_func_t and use where appropriate
|
| 1.7 | 08-Jan-2001 |
jdolecek | constify a bit, small cleanups
|
| 1.6 | 08-Jan-2001 |
jdolecek | by default, use stable sort; add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
| 1.5 | 16-Oct-2000 |
jdolecek | enlarge line buffer as necessary, so that it's possible to process lines longer than 65522 characters; constify, rename MAXLLEN to DEFLLEN
|
| 1.4 | 07-Oct-2000 |
simonb | Include <string.h> to get prototype for memcpy(). Fixed compile problems on alpha (and other LP64 archs?).
XXX: Can't gcc be fixed so that it doesn't auto-prototype mem*()??
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.18.32.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.19.12.2 | 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE
|
| 1.19.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.19.8.1 | 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
| 1.19.6.2 | 29-Jun-2010 |
riz | Pull up following revision(s) (requested by dholland in ticket #1420): usr.bin/sort/sort.h: revision 1.31 usr.bin/sort/sort.c: revision 1.58 usr.bin/sort/fsort.c: revision 1.47 usr.bin/sort/msort.c: revision 1.30 Don't touch past the end of allocated region. It results in a segmentation violation.
|
| 1.19.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
| 1.16 | 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
| 1.15 | 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
| 1.14 | 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
| 1.13 | 28-Apr-2008 |
martin | branches: 1.13.6; 1.13.12; Remove clause 3 and 4 from TNF licenses
|
| 1.12 | 21-Feb-2007 |
hubertf | branches: 1.12.10; <ctype.h> is unused. What's still needed is <sys/cdefs.h> (which is usually included at that place anyways).
From Slava Semushin <slava.semushin@gmail.com>.
|
| 1.11 | 07-Aug-2003 |
jdolecek | add TNF copyright
|
| 1.10 | 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
| 1.9 | 23-Dec-2002 |
jdolecek | simplify a bit (no need for separate 'char *path')
|
| 1.8 | 23-Feb-2001 |
jdolecek | Use MAXPATHLEN (which is 1024) instead of _POSIX_PATH_MAX (which is only 255). This change tracks change in rev 1.25 of sort.c by Christos Zoulas. While here, improve error messages slightly.
|
| 1.7 | 19-Feb-2001 |
jdolecek | resurrect old ftmp() - it supports alternative directory for temporary file, which is needed for -T support
|
| 1.6 | 08-Jan-2001 |
jdolecek | make ftmp() wrapper around tmpfile(), there is no need to reimplement it; move ftmp() from tmp.c to files.c; g/c no longer needed stuff
|
| 1.5 | 16-Oct-2000 |
jdolecek | include a bit more information in error messages
|
| 1.4 | 11-Oct-2000 |
thorpej | Format string fixes.
|
| 1.3 | 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
| 1.2 | 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
| 1.1 | 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
| 1.1.1.1 | 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
| 1.12.10.1 | 18-May-2008 |
yamt | sync with head.
|
| 1.13.12.1 | 21-Apr-2010 |
matt | sync to netbsd-5
|
| 1.13.6.1 | 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084): usr.bin/sort/Makefile: revision 1.6-1.8 usr.bin/sort/append.c: revision 1.15-1.22 usr.bin/sort/fields.c: revision 1.20-1.30 usr.bin/sort/files.c: revision 1.27-1.40 usr.bin/sort/fsort.c: revision 1.33-1.45 usr.bin/sort/fsort.h: revision 1.14-1.17 usr.bin/sort/init.c: revision 1.19-1.23 usr.bin/sort/msort.c: revision 1.19-1.28 usr.bin/sort/radix_sort.c: revision 1.1-1.4 usr.bin/sort/sort.1: revision 1.27-1.29 usr.bin/sort/sort.c: revision 1.47-1.56 usr.bin/sort/sort.h: revision 1.20-1.30 usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open. Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|