History log of /src/usr.bin/sort/sort.c |
Revision | | Date | Author | Comments |
1.64 |
| 10-Jan-2017 |
christos | refactor includes, add <sys/stat.h>
|
1.63 |
| 01-Jun-2016 |
wiz | branches: 1.63.2; Sort options and their descriptions. Sync usage more with man page. Bump date in man page for new option -C.
|
1.62 |
| 01-Jun-2016 |
kre | Add the posix -C option (-c but quieter). Fix -R to work properly when setting \n as the record delimiter using a numeric value rather than literal \n - and to not incorrectly turn \n into a field separator if -R is used to make some other char the record separator (\n becomes a field separator in that case as long as the field separator remains "white space" but should not be in any other case - unless set explicitly of course.)
Plus more cosmetic changes - the man page and usage are updated to make it more clear that the 2 (or 1) params to -k are not fields (field1 and field2) but specifiers of the beginning and end of one key field. There was an unused 'x' option in the GETOPTS string. The usage message is reformatted to display properly on both 80 col and > 80 col displays (on < 80 it will still probably look pretty ugly ... perhaps not quite so bad though), and is also updated to show the different usage for the -c case (and -C) from the others (only 1 file permitted) - the man page synopsis has a similar update.
Using more than one of -c -C or -m generates a usage message rather than just ignoring the -m as it did before (there was no -C before of course).
Aside from the bug fix to the interaction between -R and -t, there are no changes that affect the way anything is sorted (or read, or written).
Discussed on tech-userlevel earlier this week.
|
1.61 |
| 16-Sep-2011 |
joerg | Use __dead
|
1.60 |
| 18-Dec-2010 |
christos | Add an 'l' style for sorting that sorts by the string length of the field.
|
1.59 |
| 05-Jun-2010 |
dholland | fixit() needs to know the getopt options list to do its thing correctly.
|
1.58 |
| 05-Feb-2010 |
enami | Don't touch past the end of the allocated region. It results in a segmentation violation.
|
1.57 |
| 06-Nov-2009 |
joerg | Retire __SCCSID. It has only archeological value now. Also retire lint conditional around __RCSID, lint can handle that fine.
|
1.56 |
| 26-Sep-2009 |
dsl | Move all the fopen() calls out of the record read routines into the callers. Split the merge sort so that fsort() can pass the 'FILE *' of the temporary files to be merged into the merge code. Don't rely on realloc() not moving the end address of a buffer! Rework merge sort so that it sorts pointers to 'struct mfile' and only copies about sort record descriptors. No functional change intended.
|
1.55 |
| 10-Sep-2009 |
dsl | Save length of key instead of relying on the weight of the record sep. This frees a byte value to use for 'end of key' (to correctly sort short keys) while still having a weight assigned to the field sep. (Unless -t is given, the field sep is in the field data.) Do reverse sorts by writing the output file in reverse order (rather than reversing the sort - apart from merges). All key compares are now unweighted. For 'sort -u' mark duplicate keys during the sort and don't write to the output. Use -S to mean a posix sort - where equal keys are sorted using the raw record (rather than being kept in the original order). For 'sort -f' (no keys) generate a key of the folded data (as for -n -i and -d), simplifies the code and allows a 'posix' sort.
|
1.54 |
| 05-Sep-2009 |
dsl | Include a local copy of the sradixsort() code from libc. Currently unchanged apart from the deletion of the 'unstable' version and other unneeded code. Use fldtab[0]. not fldtab-> when we are referring to the global info in the 0th entry to emphasise that this entry is different. fldtab[0].weights is only needed in the SINGL_FLD case - so set it there. Re-indent a big 'if' in setfield() so that the line breaks match the logic - which looks dubious now!
|
1.53 |
| 22-Aug-2009 |
dsl | Put radixsort() and sradixsort() the correct way around.
|
1.52 |
| 22-Aug-2009 |
dsl | Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only has to be done when writing the output file for small files. Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504 PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be done on the merging for very large files.
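The idea of a key that is "always sortable using memcmp()" can be illustrated with a small sketch. This is not the encoding sort(1) uses (the log does not describe it in detail); it is a common technique for fixed-width signed integers, shown here under that assumption. The names encode_key() and compare_keys() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, not the sort(1) encoding: turn a signed
 * 32-bit value into 4 bytes whose memcmp() order equals numeric order.
 * Flipping the sign bit maps negative values below positive ones when
 * the bytes are compared as unsigned; big-endian layout puts the most
 * significant byte first so memcmp() sees it first.
 */
static void
encode_key(int32_t v, unsigned char out[4])
{
	uint32_t u = (uint32_t)v ^ UINT32_C(0x80000000);

	out[0] = (unsigned char)(u >> 24);
	out[1] = (unsigned char)(u >> 16);
	out[2] = (unsigned char)(u >> 8);
	out[3] = (unsigned char)u;
}

/* Compare two encoded keys; no weight table or numeric parsing needed. */
static int
compare_keys(const unsigned char a[4], const unsigned char b[4])
{
	return memcmp(a, b, 4);
}
```

Once every record carries such a key, the comparison routine no longer needs to know whether the key came from a number, a folded string, or anything else.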
|
1.51 |
| 20-Aug-2009 |
dsl | Delete more unwanted/unused cruft. Simplify logic for reading input records. Do a merge sort whenever we have 16 partial sorted blocks. The patient is breathing, but still carrying a lot of extra weight.
|
1.50 |
| 18-Aug-2009 |
dsl | The code that attempted to sort large files by sorting each chunk by the first key byte and writing to a temp file, then sorting the records from each temp file that had the same first key byte (and repeating for up to 4 key bytes) was a nice idea, but completely doomed to failure. Eg PR/9308 where a 70MB file has all but one record the same and short keys. Not only does the code not work, it is rather guaranteed to be slow. Instead always use a merge sort for fully sorted chunks of records (each temporary file contains one lot of sorted records). The -H option already did this, so just rip out all the code and variables that can't be used when -H was specified. Further cleanup to come ...
|
1.49 |
| 15-Aug-2009 |
dsl | Ansify. I'm looking at fixing the 'sort -n' fubars, but this code is an impenetrable mess - which needs some fixing first!
|
1.48 |
| 13-Apr-2009 |
lukem | Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
|
1.47 |
| 08-Nov-2008 |
christos | branches: 1.47.2; Make -R accept numeric arguments so one can say -R '\0' to be used in pipelines like find . -print0 | sort -R '\0'. From Anon Ymous
|
1.46 |
| 21-Jul-2008 |
lukem | branches: 1.46.4; 1.46.8; Remove the \n and tabs from the __COPYRIGHT() strings. Tweak to use a consistent format.
|
1.45 |
| 28-Apr-2008 |
martin | branches: 1.45.2; Remove clause 3 and 4 from TNF licenses
|
1.44 |
| 23-Oct-2006 |
jdolecek | branches: 1.44.16; when using -o into file which already exists, copy the permissions of the original file to the new (sorted) file
addresses PR bin/26860 by Michael van Elst
|
1.43 |
| 23-Oct-2006 |
jdolecek | replace access(2) + /dev/ prefix check with lstat(2) and S_ISCHR()/S_ISBLK()
part of PR bin/26860 by Michael van Elst
while here, put output file fopen() inside the code block of the only code path where it's actually needed, to make the logic more obvious; and in the "stdout" case, initialize toutpath to empty string rather than /dev/stdout, to make it clear /dev/stdout is not actually used
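A minimal sketch of the lstat(2) plus S_ISCHR()/S_ISBLK() check mentioned in this revision, assuming a helper named is_device() (hypothetical; the log does not show the actual code):

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: report whether path names a character or block
 * device. Returns 1 if it does, 0 if it does not, -1 if lstat() fails
 * (for example because the path does not exist).
 */
static int
is_device(const char *path)
{
	struct stat st;

	if (lstat(path, &st) == -1)
		return -1;
	return S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode);
}
```

Because lstat() does not follow symbolic links, a symlink that points at a device is reported as not being a device. Testing the file type this way avoids depending on the path starting with /dev/.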
|
1.42 |
| 23-Oct-2006 |
jdolecek | use F_OK instead of 0 for second parameter of access(2)
part of PR bin/26860 by Michael van Elst
|
1.41 |
| 23-Jul-2004 |
wiz | Sync usage with man page. From Kouichirou Hiratsuka in PR 26278.
|
1.40 |
| 14-Mar-2004 |
heas | remove double initialisation of SINGL_FLD & SEP_FLAG
|
1.39 |
| 17-Feb-2004 |
jdolecek | ftpos pointer was not updated when fldtab was reallocated; drop completely in favour of an index counter
fixes bin/24449 by Jun-ichiro itojun
|
1.38 |
| 17-Feb-2004 |
jdolecek | fldtab[] needs to have one extra element - this marks end of array
addresses part of PR bin/24449 by Jun-ichiro itojun
|
1.37 |
| 17-Feb-2004 |
itojun | use safer realloc idiom
memset new region got by realloc
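The "safer realloc idiom" usually means assigning the result of realloc() to a temporary so the original pointer is not lost on failure. A minimal sketch of that idiom together with zeroing the newly added region follows; the helper name grow_zeroed() and its signature are hypothetical and not taken from sort.c.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow *bufp from old_n to new_n elements of
 * elem_size bytes each and zero the newly added tail.
 * Returns 0 on success, -1 on failure; on failure *bufp is unchanged
 * and still valid.
 */
static int
grow_zeroed(void **bufp, size_t old_n, size_t new_n, size_t elem_size)
{
	void *p;

	if (new_n <= old_n)
		return 0;		/* nothing to grow */
	if (elem_size == 0 || new_n > (size_t)-1 / elem_size)
		return -1;		/* size would be zero or overflow */

	/* Keep the old pointer until realloc() is known to have succeeded. */
	p = realloc(*bufp, new_n * elem_size);
	if (p == NULL)
		return -1;

	/* realloc() leaves the added bytes uninitialized; clear them. */
	memset((char *)p + old_n * elem_size, 0, (new_n - old_n) * elem_size);
	*bufp = p;
	return 0;
}
```

Writing `buf = realloc(buf, n)` directly would leak the old block when realloc() returns NULL, which is the problem the temporary avoids.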
|
1.36 |
| 17-Feb-2004 |
itojun | initialize fldtab
|
1.35 |
| 15-Feb-2004 |
jdolecek | remove compile-time limit on number of -k options, allocate necessary structures as-needed
|
1.34 |
| 07-Aug-2003 |
jdolecek | add TNF copyright
|
1.33 |
| 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22365, verified by myself.
|
1.32 |
| 24-Dec-2002 |
jdolecek | g/c many_files(), too
|
1.31 |
| 24-Dec-2002 |
jdolecek | bump 'soft' limit for number of files to hard limit on startup; we want to be able to open as many temporary files as possible
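Raising the soft limit to the hard limit can be sketched with getrlimit(2) and setrlimit(2) as below. The function name raise_nofile_limit() is hypothetical, and this is an illustration of the approach rather than the code from this revision. Some systems may reject a soft limit equal to an unlimited hard limit, so the return value should be checked.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft limit on open file descriptors
 * to the current hard limit. Returns 0 on success, -1 on error.
 */
static int
raise_nofile_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
		return -1;
	rl.rlim_cur = rl.rlim_max;	/* an unprivileged process may go up to the hard limit */
	return setrlimit(RLIMIT_NOFILE, &rl);
}
```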
|
1.30 |
| 24-Dec-2002 |
jdolecek | move fldtab outside main and make it static, eliminate two memset()s
g/c superfluous extern definition for clist[] and ncols
make toutpath[] static
|
1.29 |
| 27-Nov-2002 |
tron | Remove the statically initialized "sigaction" structure completely because such usage is broken. Problem pointed out by Klaus Klein on "sources-changes@netbsd.org".
|
1.28 |
| 27-Nov-2002 |
tron | Add braces in a statically initialized "sigaction" structure to fix a build problem after siginfo(2) has been added.
|
1.27 |
| 14-May-2001 |
jdolecek | disable the code which maxes nofiles limit, it should not be normally needed now
|
1.26 |
| 30-Apr-2001 |
ross | XXX For some reason this program wants to open _hundreds_ of temporary files. Make it setrlimit(RLIMIT_NOFILE, ...), so this rather dubious strategy at least works well enough to ctag(1) our own kernel. XXX
|
1.25 |
| 22-Feb-2001 |
christos | - use MAXPATHLEN (1024) instead of _POSIX_PATH_MAX (255) for the temporary path buffer
- provide better error messages about why the temp file creation is failing
- explicitly compare syscall return to -1 instead of < 0 and fdopen return to NULL instead of 0.
|
1.24 |
| 21-Feb-2001 |
christos | Fix problem when using sort >> foo
If no output file was specified sort fopened("/dev/stdout", "w"). This is *wrong* because "/dev/stdout" will truncate the output file, thus undoing the append effect the shell had set up. The simple fix here is to just arrange for outfp = stdout and don't play with /dev/stdout.
While I am here:
- KNF
- make pattern for mkstemp have 6 X's.
|
1.23 |
| 19-Feb-2001 |
jdolecek | full -T support
|
1.22 |
| 19-Feb-2001 |
jdolecek | resurrect old ftmp() - it supports alternative directory for temporary file, which is needed for -T support
|
1.21 |
| 07-Feb-2001 |
jdolecek | use -R instead of -w, since that's what OpenBSD is using and there is no reason to be different
|
1.20 |
| 07-Feb-2001 |
jdolecek | Since -T is used to select directory for temporary files in other sort implementations, we should avoid using it for something else. Use (new) flag -w for setting record delimiter, make -T noop.
|
1.19 |
| 07-Feb-2001 |
jdolecek | use errx(), not err() within section for '-t' flag
|
1.18 |
| 13-Jan-2001 |
soren | And make usage() test for NULL explicitly..
|
1.17 |
| 13-Jan-2001 |
soren | usage() expects a NULL when there is no specific error message.
|
1.16 |
| 13-Jan-2001 |
jdolecek | save couple of cycles and bytes by static initialization of sigaction act and sigtable[]
|
1.15 |
| 12-Jan-2001 |
jdolecek | alltable[], itable[], dtable[] were moved to init.c, g/c from sort.[ch]
put extern declaration for gweights[] to sort.h
add -s/-S to usage(), couple of formatting nits
|
1.14 |
| 11-Jan-2001 |
jdolecek | the g/c in rev 1.12 was too aggressive - put back code to change file '-' to '/dev/stdin'
|
1.13 |
| 11-Jan-2001 |
jdolecek | general cleanup of file list passing:
* get rid of union f_handle, replace by passing explicit int parameter and (new) struct filelist
* add new typedefs gen_func_t and put_func_t and use where appropriate
|
1.12 |
| 08-Jan-2001 |
jdolecek | make ftmp() wrapper around tmpfile(), there is no need to reimplement it
move ftmp() from tmp.c to files.c
g/c no longer needed stuff
|
1.11 |
| 08-Jan-2001 |
jdolecek | call setlocale() on startup
reformat the switch contents in main() a little, sort flags by alphabet where possible
|
1.10 |
| 08-Jan-2001 |
jdolecek | constify a bit, small cleanups
|
1.9 |
| 08-Jan-2001 |
jdolecek | by default, use stable sort
add -S flag to switch to non-stable sort; for GNU sort compatibility, provide -s flag too
|
1.8 |
| 16-Oct-2000 |
jdolecek | include a bit more information in error messages, constify
put temporary files in _PATH_TMP by default
|
1.7 |
| 11-Oct-2000 |
thorpej | Format string fixes.
|
1.6 |
| 07-Oct-2000 |
bjh21 | OpenBSD revision 1.5: Normalize treatment of -n option. Don't know why it was ever special-cased (since it was broken that way).
|
1.5 |
| 07-Oct-2000 |
bjh21 | OpenBSD revision 1.3: for implied stdin, do not corrupt argv[0]
|
1.4 |
| 07-Oct-2000 |
bjh21 | Part of OpenBSD revision 1.2: Fix err(3) usage.
|
1.3 |
| 07-Oct-2000 |
bjh21 | Two classes of changes from the initial OpenBSD commit of this sort(1): FILE * variables are called "fp" rather than "fd". Better (safer) temporary-file handling.
|
1.2 |
| 07-Oct-2000 |
bjh21 | Hit sort(1) with a hammer till it compiles. Also add RCSIDs.
|
1.1 |
| 07-Oct-2000 |
bjh21 | branches: 1.1.1; Initial revision
|
1.1.1.1 |
| 07-Oct-2000 |
bjh21 | 4.4BSD-Lite2 contrib/sort
|
1.44.16.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.45.2.1 |
| 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
1.46.8.2 |
| 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
|
1.46.8.1 |
| 21-Apr-2010 |
matt | sync to netbsd-5
|
1.46.4.2 |
| 29-Jun-2010 |
riz | Pull up following revision(s) (requested by dholland in ticket #1420):
usr.bin/sort/sort.h: revision 1.31
usr.bin/sort/sort.c: revision 1.58
usr.bin/sort/fsort.c: revision 1.47
usr.bin/sort/msort.c: revision 1.30
Don't touch past the end of the allocated region. It results in a segmentation violation.
|
1.46.4.1 |
| 14-Oct-2009 |
sborrill | Pull up the following revision(s) (requested by dsl in ticket #1084):
usr.bin/sort/Makefile: revision 1.6-1.8
usr.bin/sort/append.c: revision 1.15-1.22
usr.bin/sort/fields.c: revision 1.20-1.30
usr.bin/sort/files.c: revision 1.27-1.40
usr.bin/sort/fsort.c: revision 1.33-1.45
usr.bin/sort/fsort.h: revision 1.14-1.17
usr.bin/sort/init.c: revision 1.19-1.23
usr.bin/sort/msort.c: revision 1.19-1.28
usr.bin/sort/radix_sort.c: revision 1.1-1.4
usr.bin/sort/sort.1: revision 1.27-1.29
usr.bin/sort/sort.c: revision 1.47-1.56
usr.bin/sort/sort.h: revision 1.20-1.30
usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files. Use a local radixsort() function so we can pass record length. Avoid use of weight tables for key compares. Fix generation of keys for numbers, negate value for reverse sort. Write file in reverse-key order for 'sort -n'. 'sort -S' now does a posix sort (sort matching keys by record data). Ensure merge sort doesn't have too many temporary files open.
Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816 PR#37860 PR#39308 PR#42094
|
1.47.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
|
1.63.2.1 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|