CVS commit: src/usr.bin/sort
Module Name: src
Committed By: dsl
Date: Tue Aug 18 18:00:28 UTC 2009
src/usr.bin/sort: append.c files.c fsort.c sort.c sort.h
The code that attempted to sort large files by sorting each chunk by the
first key byte and writing to a temp file, then sorting the records from
each temp file that had the same first key byte (and repeating for up to
4 key bytes) was a nice idea, but completely doomed to failure.
E.g. PR/9308, where a 70MB file has all but one record the same and short keys.
Not only does the code not work, it is rather guaranteed to be slow.
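
To illustrate the failure mode described above, here is a minimal sketch (not the removed NetBSD code) that counts how many records fall into the same bucket when partitioning on the first few key bytes. The function name same_prefix_count and the MAX_DEPTH constant are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#include <string.h>

/* The commit message says up to 4 key bytes were tried; the name is ours. */
#define MAX_DEPTH 4

/*
 * Count the records whose first `depth` key bytes match those of `key`.
 * strncmp() stops at the terminating NUL, so identical keys shorter than
 * `depth` still compare equal and stay in the same bucket.
 */
size_t same_prefix_count(const char **recs, size_t n, const char *key,
    size_t depth)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (strncmp(recs[i], key, depth) == 0)
            count++;
    return count;
}
```

With input where all but one record is the same short key, the count for that key does not shrink as depth goes from 1 to MAX_DEPTH, so repartitioning on further key bytes never makes the bucket smaller.
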
Instead, always use a merge sort of fully sorted chunks of records (each
temporary file contains one lot of sorted records).
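
The approach above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in fsort.c: records are NUL-terminated strings without embedded newlines and shorter than MAX_LINE, keys are compared with strcmp(), and the names external_sort, write_run, merge_runs, MAX_RUNS and MAX_LINE are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RUNS 16     /* hypothetical cap on the number of temp files */
#define MAX_LINE 256    /* hypothetical cap on record length */

static int cmp_record(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort one chunk in memory and write it to its own temporary file. */
static FILE *write_run(const char **recs, size_t n)
{
    FILE *fp = tmpfile();

    if (fp == NULL)
        return NULL;
    qsort(recs, n, sizeof(*recs), cmp_record);
    for (size_t i = 0; i < n; i++)
        fprintf(fp, "%s\n", recs[i]);
    rewind(fp);
    return fp;
}

/* Read the next record from a run, stripping the newline; 0 at EOF. */
static int next_record(FILE *fp, char *buf)
{
    if (fgets(buf, MAX_LINE, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Merge the sorted runs by repeatedly emitting the smallest head record. */
static void merge_runs(FILE **runs, size_t nruns, FILE *out)
{
    char head[MAX_RUNS][MAX_LINE];
    int live[MAX_RUNS];

    assert(nruns <= MAX_RUNS);
    for (size_t i = 0; i < nruns; i++)
        live[i] = next_record(runs[i], head[i]);
    for (;;) {
        int best = -1;

        for (size_t i = 0; i < nruns; i++)
            if (live[i] && (best < 0 || strcmp(head[i], head[best]) < 0))
                best = (int)i;
        if (best < 0)
            break;
        fprintf(out, "%s\n", head[best]);
        live[best] = next_record(runs[best], head[best]);
    }
}

/*
 * Sort recs[0..n) in chunks of at most `chunk` records, one temporary file
 * per sorted chunk, then merge the files into `out`.  Reorders `recs`
 * within each chunk.  Returns 0 on success, -1 on failure.
 */
int external_sort(const char **recs, size_t n, size_t chunk, FILE *out)
{
    FILE *runs[MAX_RUNS];
    size_t nruns = 0;

    if (chunk == 0)
        return -1;
    for (size_t off = 0; off < n; off += chunk) {
        size_t len = n - off < chunk ? n - off : chunk;

        if (nruns == MAX_RUNS ||
            (runs[nruns] = write_run(recs + off, len)) == NULL) {
            while (nruns > 0)
                fclose(runs[--nruns]);
            return -1;
        }
        nruns++;
    }
    merge_runs(runs, nruns, out);
    for (size_t i = 0; i < nruns; i++)
        fclose(runs[i]);
    return 0;
}
```

Because every temporary file is already fully sorted, the merge does the same amount of work per record whether the keys are distinct or nearly all identical, which is the case that defeated the first-key-byte scheme.
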
The -H option already did this, so just rip out all the code and variables
that can't be used when -H is specified.
Further cleanup to come ...
To generate a diff of this commit:
cvs rdiff -u -r1.17 -r1.18 src/usr.bin/sort/append.c
cvs rdiff -u -r1.33 -r1.34 src/usr.bin/sort/files.c
cvs rdiff -u -r1.36 -r1.37 src/usr.bin/sort/fsort.c
cvs rdiff -u -r1.49 -r1.50 src/usr.bin/sort/sort.c
cvs rdiff -u -r1.22 -r1.23 src/usr.bin/sort/sort.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.