tech-userlevel: Re: sh reads byte by byte

Subject: Re: sh reads byte by byte
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: David Laight <david@l8s.co.uk>
List: tech-userlevel
Date: 01/20/2007 19:45:35

On Sat, Jan 20, 2007 at 05:01:08PM +0100, Manuel Bouyer wrote:
> Hi,
> while trying to see how to speed up audit-packages as used by the bulk builds,
> I noticed (using ktrace) that something like this:
> #!/bin/sh
> 
> while read a b c; do
> echo $a $b $c
> done < /tmp/file
> 
> will read /tmp/file byte by byte. With the current pkg-vulnerabilities
> this makes 221909 syscalls. Would there be a way to have it read the
> file in a more efficient way (rewriting it in another language is not an
> option for now)?

Not easily. The problem is that the shell has to leave the fd positioned at
the correct byte after each 'read', since it might fork/exec some other
process that reads from the same fd and expects to get the byte
following the newline.
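
To illustrate the constraint, here is a minimal sketch. The scratch file
and the variable names are made up for the example. The builtin 'read'
takes the first line, and cat, a separate process sharing the same fd 0,
must then see the rest of the file starting right after that newline:

```shell
#!/bin/sh
# Hypothetical scratch file, created only for this illustration.
f=$(mktemp) || exit 1
printf 'one\ntwo\nthree\n' > "$f"

# 'read' runs inside the shell and consumes exactly one line.
# 'cat' is a child process that inherits the same file descriptor and
# continues from whatever offset the shell left behind.
rest=$( { read -r first; echo "builtin read got: $first"; cat; } < "$f" )
printf '%s\n' "$rest"

rm -f "$f"
```

If the shell had read a whole block here without repositioning, cat would
start past "two" and lines would silently go missing.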

If the shell could determine whether the input file was seekable, it could
do a longer read and then lseek() back to the byte just past the newline.
It also might be possible to use mmap() and lseek(), assuming that anything
you can mmap is seekable.
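
The seekability check matters because the same guarantee has to hold when
stdin is a pipe, which cannot be repositioned. A small sketch of that case
(the variable names are only illustrative):

```shell
#!/bin/sh
# Here fd 0 is a pipe, so lseek() would fail on it. The shell therefore
# cannot read ahead and step back; it must stop at the first newline so
# that cat still receives the remaining lines.
rest=$(printf 'one\ntwo\nthree\n' | { read -r first; cat; })
printf '%s\n' "$rest"
```

Any read-ahead optimisation would need to fall back to the byte-by-byte
path for input like this.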

	David

-- 
David Laight: david@l8s.co.uk