Subject: bin/9974: locate(1) database broken
To: None <gnats-bugs@gnats.netbsd.org>
From: Wolfgang Rupprecht <wolfgang@wsrcc.com>
List: netbsd-bugs
Date: 04/24/2000 15:11:12
>Number:         9974
>Category:       bin
>Synopsis:       a recent change in locate broke the database
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Apr 24 15:12:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Wolfgang Rupprecht
>Release:        NetBSD-current 4/26/2000
>Organization:
W S Rupprecht Computer Consulting, Fremont CA
>Environment:
System: NetBSD capsicum.wsrcc.com 1.4X NetBSD 1.4X (WSRCC) #0: Tue Apr 4 13:11:49 PDT 2000 root@capsicum.wsrcc.com:/v/src/netbsd/NetBSD-current/usr/src/sys/arch/i386/compile/WSRCC i386

>Description:
	a recent change in locate(1) broke the database

>How-To-Repeat:
	locate cat

/bin/cat
/u/cvsr
/u/cvsr
/u/cvsr
/u/cvsr
/u/cvsr
/u/cvsr
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/
/u/

Note the truncated lines (and/or some other db problem): the
searched-for term does not even appear on the output lines.

>Fix:
	Unknown, but I suspect the locate.updatedb script broke when it was
	converted from csh to sh.  In particular I suspect this change:

  from:
    $LIBDIR/locate.bigram < $filelist | \
	    (sort -T "$TMPDIR"; echo $status >> $errs) | \
	    uniq -c | sort -T "$TMPDIR" -nr | \
	    awk '{ if (NR <= 128) print $2 }' | tr -d '\012' > $bigrams

  to:
    BIGRAMS=`$LIBDIR/locate.bigram < $FILELIST`
    $LIBDIR/locate.code $BIGRAMS < $FILELIST > $FCODES

Note the change from reading the bigrams from a file to putting them
*all* on the command line.  What is the limit on command-line argument
length?  Can one really put tens of megabytes on the command line?
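
One way to test this suspicion is to compare the size of the bigram
output against the system's ARG_MAX limit.  The following is only a
rough sketch, assuming getconf(1) is available; fits_in_args is a
hypothetical helper (not part of locate.updatedb), and the 4096-byte
fallback is the POSIX minimum (_POSIX_ARG_MAX), not a measured value.

```shell
#!/bin/sh
# Hypothetical helper: would <bytes> of arguments fit under <limit>?
# usage: fits_in_args <bytes> <limit>
fits_in_args() {
    if [ "$1" -lt "$2" ]; then
        echo yes
    else
        echo no
    fi
}

# ARG_MAX bounds the combined size of exec() arguments and environment.
limit=`getconf ARG_MAX 2>/dev/null`
case "$limit" in
    ''|*[!0-9]*) limit=4096 ;;   # assumption: fall back to the POSIX minimum
esac
echo "ARG_MAX: $limit bytes"

# In locate.updatedb the real size could be measured with something like:
#   size=`$LIBDIR/locate.bigram < $FILELIST | wc -c`
# Here a made-up 20 MB figure stands in for it.
answer=`fits_in_args 20000000 "$limit"`
echo "20000000 bytes fit on the command line: $answer"
```

If the measured size exceeds the limit, passing the bigrams as
arguments cannot work reliably and they would need to go through a
file or a pipe instead.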

>Release-Note:
>Audit-Trail:
>Unformatted: