Re: performance of shell read loops

To: Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
Subject: Re: performance of shell read loops
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Sat, 12 Mar 2016 22:17:26 +0700

    Date:        Sat, 12 Mar 2016 14:50:59 +0100
    From:        Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
    Message-ID:  <20160312135059.GA27612%britannica.bec.de@localhost>

  | I'm not sure. A lot of shell processing also happens on real files.

There are three cases that could work to improve this - when input is
seekable, when it is a tty, and when analysis of the command sequence
shows nothing else can possibly read the stdin that is being read by read
(which would have handled DHolland's test case OK).

The first two tests are fairly easy, and could be done, though it would need
a little care to calculate the seek position for the seekable input case, and
to make sure to always seek when required.
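
To illustrate why the seek position matters, here is a minimal sketch (the
function name and the use of mktemp are my own, purely for illustration).
Whatever read does internally, the next reader of the same open file has to
find the offset sitting just past the line read consumed:

```shell
# Hypothetical demo: read must consume exactly one line of a shared stdin.
demo_shared_offset() {
	tmp=$(mktemp) || return 1
	printf 'one\ntwo\nthree\n' > "$tmp"
	# read takes line 1; cat inherits the same open file description
	# and so starts wherever read left the file offset.
	{ read -r first; rest=$(cat); } < "$tmp"
	rm -f "$tmp"
	printf '%s|%s\n' "$first" "$rest"
}
```

A shell that buffered a block for read and did not seek back before running
cat would hand cat nothing (or the wrong tail of the file).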

A "good enough" test for the third could, I think, be done by (assuming the
read command is in a while/until loop - if not we don't care that it does
byte at a time reads, and for loops always might exit prematurely) checking
that the loop only exits (only can exit - signals aside, they would always
leave an undefined state, so don't matter) when read "fails", either by
writing the loop as
	while read whatever
	do
	done < somewhere
(or similar using piped input) or writing it as
	while :	# or true
	do
		read whatever || break	; # or return or exit
	done
and that there are no non-builtin commands in the loop that don't have
their own stdin redirected, and no other ways to escape the loop (no
break or return triggered by any other condition) - unless the stdin is
open only for this loop (as in the "done < somewhere" case above).
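
A rough sketch of the cases those conditions are meant to separate (the
function names are hypothetical, and piped input is used just to keep each
example self-contained). In the first an external command shares the loop's
stdin, in the second it has its own stdin, and in the third the loop exits
early and a later read still needs the remaining input:

```shell
# Unsafe to buffer: cat inherits the loop's stdin and eats the rest of it.
loop_shared_stdin() {
	printf 'a\nb\nc\n' | while read -r line; do
		cat > /dev/null
		printf '%s\n' "$line"
	done
}

# Safe to buffer: the only external command has its own stdin redirected.
loop_own_stdin() {
	printf 'a\nb\nc\n' | while read -r line; do
		cat < /dev/null > /dev/null
		printf '%s\n' "$line"
	done
}

# Unsafe to buffer: break leaves the loop while input remains, and the
# read after the loop must still see the next unread line.
loop_early_break() {
	printf 'x\nstop\nafter\n' | {
		while read -r line; do
			[ "$line" = stop ] && break
		done
		read -r next
		printf '%s\n' "$next"
	}
}
```

With byte at a time reads, loop_shared_stdin prints only the first line,
loop_own_stdin prints all three, and loop_early_break prints "after". Only
the second has the shape where buffering by read could not be observed.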

Builtin commands could cooperate with whatever buffering read was doing,
so they wouldn't be an issue (not that there are any that read stdin that
I recall, other than read).

I have been idly considering whether those tests would be reasonable to
implement.

  | Actually I was wondering if there aren't more use cases for a "read
  | until you find the following sequence" system call or just something
  | specifying a (simple) regular expression.

I was too - but I suspect that the only users that matter would want \n
as the terminator (or EOF of course) - other stuff (vi input mode that
can end on ESC) doesn't do enough input for it to be worth the bother.  So
I suspect that just line mode buffering (as ttys do, and as is available
on output via stdio) would probably be enough for 90% of uses, or more.
More complexity just makes it harder to use, and more likely to be full of
bugs.  "Line mode input" could be a simple fcntl flag.

kre
