History log of /src/usr.bin/sort/fields.c
Revision  Date  Author  Comments
 1.33  20-Jan-2013  apb When parsing numbers, allow a leading '+'.
 1.32  18-Dec-2010  christos branches: 1.32.6; 1.32.12;
Add an 'l' style for sorting that sorts by the string length of the field.
 1.31  06-Nov-2009  joerg Retire __SCCSID. It has only archeological value now. Also retire lint
conditional around __RCSID, lint can handle that fine.
 1.30  07-Oct-2009  dsl When encoding numbers, we can use all 8 bits for exponent values.
 1.29  16-Sep-2009  dsl Minor tweaks to the key generation for numeric fields.
Use 1's complement for -ve numbers to avoid conditionals.
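
The one's-complement idea in rev 1.29 can be illustrated with a minimal sketch. This is not the fields.c code: the function name encode_int_key(), the 5-byte key layout and the restriction to 32-bit integers are all assumptions made for illustration. It shows how XOR-ing every key byte with a mask (0xff for negative values, 0x00 otherwise) yields keys that memcmp() orders numerically, with no if/else on the sign.

```c
#include <stdint.h>
#include <string.h>

#define NUMKEY_LEN 5	/* hypothetical layout: 1 sign byte + 4 magnitude bytes */

/*
 * Hypothetical encoder (not the fields.c implementation).
 * Writes a key for v such that memcmp() on two keys matches numeric order.
 */
void
encode_int_key(int32_t v, unsigned char out[NUMKEY_LEN])
{
	uint32_t neg = (uint32_t)(v < 0);	/* 1 if negative, else 0 */
	uint32_t m32 = 0u - neg;		/* 0xffffffff or 0 */
	/* Branch-free absolute value; also correct for INT32_MIN. */
	uint32_t mag = ((uint32_t)v ^ m32) + neg;
	unsigned char mask = (unsigned char)m32;	/* 0xff or 0x00 */

	/* Sign byte: 0x80 for v >= 0, 0x7f for v < 0, so negatives sort first. */
	out[0] = (unsigned char)(0x80 ^ mask);
	/* Big-endian magnitude; inverted for negatives so -5 sorts before -3. */
	out[1] = (unsigned char)((mag >> 24) ^ mask);
	out[2] = (unsigned char)((mag >> 16) ^ mask);
	out[3] = (unsigned char)((mag >> 8) ^ mask);
	out[4] = (unsigned char)(mag ^ mask);
}
```

Inverting the bytes reverses the order of magnitudes, which is exactly what negative numbers need: a larger magnitude means a smaller value.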
 1.28  10-Sep-2009  dsl Save length of key instead of relying on the weight of the record sep.
This frees a byte value to use for 'end of key' (to correctly sort
short keys) while still having a weight assigned to the field sep.
(Unless -t is given, the field sep is in the field data.)
Do reverse sorts by writing the output file in reverse order (rather
than reversing the sort - apart from merges).
All key compares are now unweighted.
For 'sort -u' mark duplicate keys during the sort and don't write
them to the output.
Use -S to mean a posix sort - where equal keys are sorted using the
raw record (rather than being kept in the original order).
For 'sort -f' (no keys) generate a key of the folded data (as for -n
-i and -d), simplifies the code and allows a 'posix' sort.
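
The '-S' behaviour described in rev 1.28 (equal keys ordered by the raw record) can be sketched as a comparator. The struct rec layout and the names cmp_bytes() and rec_cmp_posix() are hypothetical and are not taken from the sort(1) sources; the sketch only assumes each record carries a generated key plus its raw data, both with explicit lengths.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a generated sort key plus the raw input line. */
struct rec {
	const unsigned char *key;
	size_t keylen;
	const unsigned char *data;
	size_t datalen;
};

/* Unweighted byte compare; a shorter string sorts before its extensions. */
static int
cmp_bytes(const unsigned char *a, size_t alen,
    const unsigned char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int r = memcmp(a, b, n);

	if (r != 0)
		return r;
	return (alen > blen) - (alen < blen);
}

/* Compare by key first; break ties using the raw record data. */
int
rec_cmp_posix(const void *pa, const void *pb)
{
	const struct rec *a = pa, *b = pb;
	int r = cmp_bytes(a->key, a->keylen, b->key, b->keylen);

	if (r != 0)
		return r;
	return cmp_bytes(a->data, a->datalen, b->data, b->datalen);
}
```

The signature matches what qsort(3) expects for an array of struct rec. Without the tie-break, records with equal keys would instead have to keep their input order, which is the default behaviour the log entry contrasts with.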
 1.27  22-Aug-2009  dsl Fix generation of unmasked alpha keys.
 1.26  22-Aug-2009  dsl Only process each number digit once.
 1.25  22-Aug-2009  dsl Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used
during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal
numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other
code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only
has to be done when writing the output file for small files.
Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504
PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be
done on the merging for very large files.
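
As a rough illustration of the rev 1.25 approach (fold case while generating the key, then compare keys with plain memcmp()), here is a minimal sketch. make_folded_key() is a hypothetical helper, not a function from fields.c, and folding to upper case with toupper() is an assumption; the log does not say which direction the real code folds.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from fields.c): copy a field into key[],
 * folding case on the way. key[] must have room for len bytes.
 * Returns the key length.
 */
size_t
make_folded_key(const char *field, size_t len, unsigned char *key)
{
	size_t i;

	for (i = 0; i < len; i++)
		key[i] = (unsigned char)toupper((unsigned char)field[i]);
	return len;
}
```

With raw bytes, memcmp() puts "Banana" before "apple" because 'B' is less than 'a' in ASCII; with folded keys "APPLE" sorts before "BANANA". The compare routine itself needs no knowledge of -f, which is the point of the "No other code has to care at all" remark.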
 1.24  20-Aug-2009  dsl Delete more unwanted/unused cruft.
Simplify logic for reading input records.
Do a merge sort whenever we have 16 partial sorted blocks.
The patient is breathing, but still carrying a lot of extra weight.
 1.23  15-Aug-2009  dsl Always add an REC_D char (usually \n) as the last sort key char - we
almost always need one.
But do ADD it, instead of overwriting the last byte of the last key since
that may be requesting the other end of the sort order.
There is no need to check for space for the line after adding the key,
but we might as well check before - just to optimise that case.
This might fix some of the sort bugs - but not the one I'm looking at!
 1.22  15-Aug-2009  dsl Remove reference to db.h by using separate ptr+len fields for the only
structure that used it.
Pass end of keybuf area, not size to enterkey() - largely to remove a
variable whose use isn't obvious from the name!
The structure of this code sucks.
 1.21  15-Aug-2009  dsl Ansify.
I'm looking at fixing the 'sort -n' fubars, but this code is an
impenetrable mess - which needs some fixing first!
 1.20  13-Apr-2009  lukem Fix WARNS=4 issues (-Wcast-qual -Wsign-compare)
 1.19  28-Apr-2008  martin branches: 1.19.6; 1.19.8; 1.19.12;
Remove clause 3 and 4 from TNF licenses
 1.18  14-Mar-2004  heas branches: 1.18.32;
Do not step over the edge of the buffer (check for '\0'). This just happens
to not lose on i386 because another buffer appears immediately following.
Regress tests all passed.
 1.17  15-Feb-2004  jdolecek make sure zero is recognized as regular number in number(), and thus sorted
properly with -n
fixes PR bin/20259 by Giles Lean, PR bin/20542 by Peter Seebach, and
part of PR bin/24316 by MLH
 1.16  15-Feb-2004  jdolecek fix -Wuninitialized warnings
 1.15  18-Oct-2003  itojun KNF (mostly whitespace)
 1.14  07-Aug-2003  jdolecek add TNF copyright
 1.13  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22365, verified by myself.
 1.12  09-Apr-2003  jdolecek rename local macro blancmange() to SKIP_BLANKS(), to clarify what
it does and to better signal it might modify its arguments
fixes PR bin/20546 by Peter Seebach
 1.11  24-Dec-2002  jdolecek add extern definition for ncols and clist[] to sort.h, eliminate extra
definitions in init.c and field.c
g/c MAXMERGE
 1.10  19-Feb-2001  jdolecek Pull up various cosmetic (mostly whitespace) changes from OpenBSD.
This is primarily to ease syncing the two versions.
 1.9  19-Feb-2001  jdolecek enterfield(): test the buffer size BEFORE assignment also for the other code
branch, since we might get called with tablepos == endkey for some special
input files (where a record would happen to fit exactly to the input
buffer) - BTW, this bug looks like it has been here ~forever ...

This seems to fix the sort crash for 'make british' build of ispell package,
as reported by Mark White at current-users@.
 1.8  19-Feb-2001  jdolecek enterkey():
* move the test for keybuf size before keypos[-1] assignment, "just in case"
* move the keypos assignment to improve readability
 1.7  13-Jan-2001  jdolecek also remove the clpos++ added in rev 1.4
 1.6  13-Jan-2001  jdolecek undo broken revision 1.4
 1.5  12-Jan-2001  jdolecek for stable sort, arrange so that really only relevant part of line
is used for sort - this makes sort pass regression test number 36

while here, slightly adjust code formatting in a couple of places
 1.4  17-Oct-2000  jdolecek cosmetic change in way one of for variables is updated
 1.3  15-Oct-2000  jdolecek don't use register declarations
 1.2  07-Oct-2000  bjh21 Hit sort(1) with a hammer till it compiles.
Also add RCSIDs.
 1.1  07-Oct-2000  bjh21 branches: 1.1.1;
Initial revision
 1.1.1.1  07-Oct-2000  bjh21 4.4BSD-Lite2 contrib/sort
 1.18.32.1  18-May-2008  yamt sync with head.
 1.19.12.1  21-Apr-2010  matt sync to netbsd-5
 1.19.8.1  13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.19.6.1  14-Oct-2009  sborrill Pull up the following revision(s) (requested by dsl in ticket #1084):
usr.bin/sort/Makefile: revision 1.6-1.8
usr.bin/sort/append.c: revision 1.15-1.22
usr.bin/sort/fields.c: revision 1.20-1.30
usr.bin/sort/files.c: revision 1.27-1.40
usr.bin/sort/fsort.c: revision 1.33-1.45
usr.bin/sort/fsort.h: revision 1.14-1.17
usr.bin/sort/init.c: revision 1.19-1.23
usr.bin/sort/msort.c: revision 1.19-1.28
usr.bin/sort/radix_sort.c: revision 1.1-1.4
usr.bin/sort/sort.1: revision 1.27-1.29
usr.bin/sort/sort.c: revision 1.47-1.56
usr.bin/sort/sort.h: revision 1.20-1.30
usr.bin/sort/tmp.c: revision 1.14-1.15

Only use radix sort for in-memory sort, always merge temporary files.
Use a local radixsort() function so we can pass record length.
Avoid use of weight tables for key compares.
Fix generation of keys for numbers, negate value for reverse sort.
Write file in reverse-key order for 'sort -n'.
'sort -S' now does a posix sort (sort matching keys by record data).
Ensure merge sort doesn't have too many temporary files open.
Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816
PR#37860 PR#39308 PR#42094
 1.32.12.1  25-Feb-2013  tls resync with head
 1.32.6.1  23-Jan-2013  yamt sync with head
