Subject: bin/13555: file(1) calls a file full of \0s ASCII text
To: None <gnats-bugs@gnats.netbsd.org>
From: Dave Huang <khym@azeotrope.org>
List: netbsd-bugs
Date: 07/26/2001 04:22:38
>Number:         13555
>Category:       bin
>Synopsis:       file(1) calls a file full of \0s ASCII text
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jul 26 02:19:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Dave Huang
>Release:        NetBSD-current as of July 25, 2001
>Organization:
Name: Dave Huang         |  Mammal, mammal / their names are called /
INet: khym@azeotrope.org |  they raise a paw / the bat, the cat /
FurryMUCK: Dahan         |  dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 25 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++
>Environment:
	
System: NetBSD yerfable.metonymy.com 1.5W NetBSD 1.5W (YERFABLE) #171: Mon Jul 23 19:34:17 CDT 2001 khym@yerfable.metonymy.com:/usr/src.local/sys/arch/alpha/compile/YERFABLE alpha
Architecture: alpha
Machine: alpha
>Description:
	file(1) describes a file that contains nothing but NULs as "ASCII
text, with no line terminators", which seems pretty misleading to
me.
>How-To-Repeat:
yerfable /tmp> dd if=/dev/zero of=zeroes bs=1k count=20
20+0 records in
20+0 records out
20480 bytes transferred in 0.001 secs (20480000 bytes/sec)
yerfable /tmp> file zeroes
zeroes: ASCII text, with no line terminators
>Fix:
There's the following section of code in ascmagic.c:
        /* Undo the NUL-termination kindly provided by process() */

        while (nbytes > 0 && buf[nbytes - 1] == '\0')
                nbytes--;

but process() only adds a single NUL byte. Why is this looping to
remove multiple NULs? In the case of a file with nothing but NULs,
the loop reduces nbytes to 0, and ascmagic()/looks_ascii() ends up not
looking at any of the file's contents. Since it doesn't see any
non-text characters, it calls the file ASCII text.
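
To make the failure concrete, here is a minimal standalone sketch of what
I think is happening. The function names are made up for illustration, and
all_bytes_look_like_text() is only a simplified stand-in for looks_ascii();
the real check in file(1) is more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the loop quoted from ascmagic.c:
 * drop every trailing NUL, not just the one process() added.
 */
static size_t
strip_trailing_nuls(const unsigned char *buf, size_t nbytes)
{
	assert(buf != NULL);
	while (nbytes > 0 && buf[nbytes - 1] == '\0')
		nbytes--;
	return nbytes;
}

/*
 * Simplified stand-in for looks_ascii() (an assumption, not the real
 * code): accept printable ASCII plus tab, newline and carriage return.
 */
static int
all_bytes_look_like_text(const unsigned char *buf, size_t nbytes)
{
	size_t i;

	for (i = 0; i < nbytes; i++) {
		unsigned char c = buf[i];

		if (!((c >= 0x20 && c < 0x7f) ||
		    c == '\t' || c == '\n' || c == '\r'))
			return 0;	/* found a non-text byte */
	}
	return 1;	/* also reached when nbytes == 0: nothing was checked */
}
```

With a buffer of 20 NUL bytes, strip_trailing_nuls() returns 0, and
all_bytes_look_like_text() then succeeds without examining a single byte.
Run over the full 20 bytes instead, the same check fails on the first NUL.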

Maybe there's a good reason to remove multiple trailing NULs though--I
dunno. But I think it'd be good to leave one byte to look at:

--- /usr/src/usr.bin/file/ascmagic.c	Wed Dec  6 05:03:49 2000
+++ ascmagic.c	Thu Jul 26 04:11:23 2001
@@ -117,7 +117,7 @@
 
 	/* Undo the NUL-termination kindly provided by process() */
 
-	while (nbytes > 0 && buf[nbytes - 1] == '\0')
+	while (nbytes > 1 && buf[nbytes - 1] == '\0')
 		nbytes--;
 
 	/*
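
As a rough check of what the one-character change does, here is the
patched loop pulled out into a standalone helper. The helper name is
hypothetical; only the loop condition comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the patched loop: strip trailing NULs
 * but always leave at least one byte for the caller to examine.
 */
static size_t
strip_trailing_nuls_keep_one(const unsigned char *buf, size_t nbytes)
{
	assert(buf != NULL);
	while (nbytes > 1 && buf[nbytes - 1] == '\0')
		nbytes--;
	return nbytes;
}
```

For a buffer of 20 NUL bytes this returns 1, so whatever inspects the
buffer next still sees one NUL. For ordinary text followed by the single
terminating NUL (e.g. the 7 bytes of "hello\n" plus '\0') it returns 6,
the same as the unpatched loop.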

>Release-Note:
>Audit-Trail:
>Unformatted: