Subject: bin/20542: 'sort -n' puts 0 before -N, for all N.
To: None <gnats-bugs@gnats.netbsd.org>
From: seebs <seebs@vash.cel.plethora.net>
List: netbsd-bugs
Date: 03/01/2003 19:30:30
>Number:         20542
>Category:       bin
>Synopsis:       'sort -n' thinks 0 is before everything
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 01 17:31:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     seebs
>Release:        NetBSD 1.6P
>Organization:
>Environment:
System: NetBSD vash.cel.plethora.net 1.6P NetBSD 1.6P (VASH) #1: Fri Feb 28 22:14:13 CST 2003 seebs@vash.cel.plethora.net:/usr/src/sys/arch/i386/compile/VASH i386
Architecture: i386
Machine: i386
>Description:
	'sort -n' puts 0 before negative numbers.  'sort -Sn' doesn't.

>How-To-Repeat:
	Run 'sort -n' on input that contains both 0 and negative numbers;
	the 0 lines are output before the negative ones.

>Fix:
	This is almost certainly not exactly right, but it shows what goes
	wrong.  For a line where the key is "0", the sequence of bytes
	used as the sort key (by radixsort, which compares bytes in more or
	less ASCII order) is just a single \x80.  Since the key is only one
	byte, and the line is longer, the logic that tries to keep radixsort
	from continuing past the end of the key is invoked, incorrectly, I
	think, though I'm not sure.  That logic stores REC_D in keypos[-1],
	which appears to overwrite the last byte of the key; for "0" that is
	the only byte, so nothing of the original key is left.

	In any event, it's almost certainly wrong to smash part of the key!
	It seems better to *add* a record delimiter *AFTER* the key; that's
	what this patch does.

	I have not regression tested this against anything except the specific
	numerical data I had the problem with.


Index: fields.c
===================================================================
RCS file: /cvsroot/src/usr.bin/sort/fields.c,v
retrieving revision 1.11
diff -c -r1.11 fields.c
*** fields.c	2002/12/24 13:20:25	1.11
--- fields.c	2003/03/02 01:27:14
***************
*** 121,127 ****
  
  	keybuf->offset = keypos - keybuf->data;
  	keybuf->length = keybuf->offset + line->size;
! 	if (keybuf->length + sizeof(TRECHEADER) > size) {
  		/* line too long for buffer */
  		return (1);
  	}
--- 121,127 ----
  
  	keybuf->offset = keypos - keybuf->data;
  	keybuf->length = keybuf->offset + line->size;
! 	if (keybuf->length + sizeof(TRECHEADER) + 1 > size) {
  		/* line too long for buffer */
  		return (1);
  	}
***************
*** 132,139 ****
  	 * 2. we want stable sort and so the items should be sorted only by
  	 *    the relevant field[s]
  	 */
! 	if (UNIQUE || (stable_sort && keybuf->offset < line->size))
! 		keypos[-1] = REC_D;
  
  	memcpy(keybuf->data + keybuf->offset, line->data, line->size);
  	return (0);
--- 132,142 ----
  	 * 2. we want stable sort and so the items should be sorted only by
  	 *    the relevant field[s]
  	 */
! 	if (UNIQUE || (stable_sort && keybuf->offset < line->size)) {
! 		*keypos++ = REC_D;
! 		++keybuf->offset;
! 		++keybuf->length;
! 	}
  
  	memcpy(keybuf->data + keybuf->offset, line->data, line->size);
  	return (0);
>Release-Note:
>Audit-Trail:
>Unformatted: