Source-Changes archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

CVS commit: src/usr.bin/sort

Module Name:    src
Committed By:   dsl
Date:           Tue Aug 18 18:00:28 UTC 2009

Modified Files:
        src/usr.bin/sort: append.c files.c fsort.c sort.c sort.h

Log Message:
The code that attempted to sort large files by sorting each chunk by the
first key byte and writing to a temp file, then sorting the records from
each temp file that had the same first key byte (and repeating for upto
4 key bytes) was a nice idea, but completely doomed to failure.
Eg PR/9308 where a 70MB file has all but one record the same and short keys.
Not only does the code not work, it is rather guaranteed to be slow.
Instead always use a merge sort for fully sorted chunk of records (each
temporary file contains one lot of sorted records).
The -H option already did this, so just rip out all the code and variables
that can't be used when -H was specified.
Further cleanup to come ...

To generate a diff of this commit:
cvs rdiff -u -r1.17 -r1.18 src/usr.bin/sort/append.c
cvs rdiff -u -r1.33 -r1.34 src/usr.bin/sort/files.c
cvs rdiff -u -r1.36 -r1.37 src/usr.bin/sort/fsort.c
cvs rdiff -u -r1.49 -r1.50 src/usr.bin/sort/sort.c
cvs rdiff -u -r1.22 -r1.23 src/usr.bin/sort/sort.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Home | Main Index | Thread Index | Old Index