Subject: Re: misc/19603
To: None <gnats-bugs@NetBSD.org>
From: None <kpneal@pobox.com>
List: netbsd-bugs
Date: 12/07/2005 22:31:18
--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Dec 03, 2005 at 09:02:02AM +0000, Roland Illig wrote:
>  Maybe the word "slow" could be explained in the man page?
>  
>  A
>  .Sq slow
>  device is one that might block for an arbitrary amount of time.

That's not correct. 

I went looking through the mail archives and I can't find the thread
where this was discussed. It was on current-users around Dec 30/31 of
2002.

I've attached the emails I sent to current-users at the time. 

I still think it is a bad idea to document behavior that is not
part of the standard. Why encourage nonportable programming practice?
-- 
Kevin P. Neal                                http://www.pobox.com/~kpn/

"Nonbelievers found it difficult to defend their position in \ 
    the presence of a working computer." -- a DEC Jensen paper





--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=current-users

From kpn Mon Dec 30 21:57:38 2002
Date: Mon, 30 Dec 2002 21:57:38 -0500
To: Bill Sommerfeld <sommerfeld@netbsd.org>
Cc: current-users@netbsd.org
Subject: Re: CVS commit: src/lib/libc/sys
Message-ID: <20021231025738.GB4122@neutralgood.org>
References: <20021230123900.50E7FB42C@cvs.netbsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20021230123900.50E7FB42C@cvs.netbsd.org>
Content-Length: 4204
Lines: 90

On Mon, Dec 30, 2002 at 02:39:00PM +0200, Bill Sommerfeld wrote:
> 
> Module Name:	src
> Committed By:	sommerfeld
> Date:		Mon Dec 30 12:39:00 UTC 2002
> 
> Modified Files:
> 	src/lib/libc/sys: read.2 write.2
> 
> Log Message:
> In EINTR description, add a crossreference to sigaction(2).
> Put reference to "slow device" back in since filesystem & disk I/O, doesn't get
> EINTR while pipes, sockets, ttys, etc., can.

Is there a definitive list of what is "slow" and what is not? For example,
is NFS slow or not? I would assume "slow" since I have a repeatable case
of dump crapping out because it gets a signal from a peer process before
a write() can actually write any data.

Would a single-speed CD-ROM count as slow or not? How about a hard disk
that has blocks the drive has trouble reading (but eventually, 10 seconds
from now, manages to read)? How about a vnd device? How about a device
that actually implements a networked device whose storage lives on
another machine? (Such devices have been described on this list in the
past year or so.)

What makes a slow device different from a non-slow? Anything? Do slow
devices allow the race condition that causes EINTR errors and non-slow
devices do not have this race condition? In an SMP world?

Is this relying on slow vs non-slow devices a portable practice? If not,
why would NetBSD want to encourage nonportable programming? If it is
portable then why doesn't 1003.1-2001 give this distinction in the read()
description which instead says this:

   [EINTR] The read operation was terminated due to the receipt of
           a signal, and no data was transferred. 

Why would NetBSD want to be so much more specific in this one particular
case than the actual standards document? 

Is there any good way for a programmer to know for sure whether or not
"slow" devices will be used as sources or destinations of data? If not
then why should the man page even bother to mention this distinction?

Why would NetBSD want to have stuff in its documentation that encourages
the coding of broken programs? "Oh, nobody will EVER make this program
send data to/from a 'slow' device!" ... thus leading to dump failing over
NFS.

NetBSD's sigaction() man page says this:

     Restarting of pending calls is requested by setting the SA_RESTART bit in
     sa_flags.  The affected system calls include open(2), read(2), write(2),
     sendto(2), recvfrom(2), sendmsg(2) and recvmsg(2) on a communications
     channel or a slow device (such as a terminal, but not a regular file) and
     during a wait(2) or ioctl(2).  However, calls that have already committed

Ok, so what's a device? Is it a regular file? (I assume this is
addressed above.)

I can't find this part of NetBSD's sigaction() man page in the 1003.1-2001
description of sigaction(), either. I'll admit I have not read through
the entire doc.
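
For reference, a minimal sketch of what requesting restart looks like in
C. The names on_signal and install_restarting_handler are hypothetical,
made up for illustration; only sigaction(), sigemptyset() and the
SA_RESTART flag come from the interface quoted above. Which calls
actually get restarted, and on which descriptors, is exactly the part
that is implementation-specific, so this should not be read as a
guarantee that EINTR never reaches the caller.

```c
#define _POSIX_C_SOURCE 200809L	/* assumption: needed on some systems for sigaction */
#include <signal.h>
#include <string.h>

/* Hypothetical handler: does nothing; it only needs to exist. */
static void
on_signal(int signo)
{
	(void)signo;
}

/*
 * Hypothetical helper: install on_signal for signo with SA_RESTART set,
 * asking the system to restart interrupted calls where it supports that.
 * Returns 0 on success, -1 on failure (errno set by sigaction).
 */
static int
install_restarting_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction(signo, &sa, NULL);
}
```

Note that this changes process-wide signal disposition, which is why it
is an awkward thing for a library to do on behalf of its caller.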


Who wants to know how much code is in NetBSD that doesn't handle EINTR
from reads and writes correctly? Let's start at the top: /bin/cat, .....
Does NetBSD want to keep code in use that is provably wrong with race
conditions just because nobody has ever reported having an actual
problem before? Or would NetBSD prefer a sweep to fix all of the broken
programs? Because I'll do the sweep if NetBSD actually wants correct
code. I've already started (how do you think I found /bin/cat, anyway?).


Lastly, what is the correct way to handle EINTR from read or write calls?
Is it to use sigaction to keep it from ever happening, or is it to change
the read and write calls to just go ahead and handle EINTR correctly? How
about if the read and write calls are in a library? Use of sigaction in
a library for this purpose doesn't seem like a very safe thing to do.
FWIW, my preference is for the reads and writes to just deal with EINTR.
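
As a rough sketch of what "just deal with EINTR" could look like in C:
the helper names write_all and read_retry are hypothetical, not taken
from the NetBSD tree, and the code assumes a blocking descriptor. It
retries a call that failed with EINTR (meaning no data was transferred)
and, for writes, also continues after a short write.

```c
#define _POSIX_C_SOURCE 200809L	/* assumption: needed on some systems for ssize_t */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all len bytes of buf to fd.
 * Retries when write() fails with EINTR and continues after short writes.
 * Returns 0 on success, -1 on any other error (errno set by write).
 */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written: retry */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: like read(), but retries when interrupted by a
 * signal before any data was transferred.
 */
static ssize_t
read_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);
	return n;
}
```

Because the retry lives next to the read or write itself, it works the
same way in a library as in a program and does not depend on how the
caller has set up its signal handlers.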


Is this really, REALLY so big of a deal that NetBSD MUST keep this mention
of "slow" devices in its man pages? Really? Why?
-- 
Kevin P. Neal                                http://www.pobox.com/~kpn/
      'Concerns about "rights" and "ownership" of domains are inappropriate.  
 It is appropriate to be concerned about "responsibilities" and "service" 
 to the community.' -- RFC 1591, page 4: March 1994

From kpn Mon Dec 30 23:55:17 2002
Date: Mon, 30 Dec 2002 23:55:17 -0500
To: Bill Sommerfeld <sommerfeld@netbsd.org>
Cc: current-users@netbsd.org
Subject: Re:  CVS commit: src/lib/libc/sys
Message-ID: <20021231045517.GB8942@neutralgood.org>
References: <20021231025738.GB4122@neutralgood.org> <200212310415.gBV4F8D03510@syn.hamachi.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200212310415.gBV4F8D03510@syn.hamachi.org>
Content-Length: 2877
Lines: 69

I've just noticed I wasn't subscribed to current-users. Bummer.  Fixed.
The list archives are missing the past few days' mail. Bummer.

On Mon, Dec 30, 2002 at 11:15:08PM -0500, Bill Sommerfeld wrote:
> > What makes a slow device different from a non-slow? Anything? Do slow
> > devices allow the race condition that causes EINTR errors and non-slow
> > devices do not have this race condition? 
> 
> It boils down to whether the implementation of the descriptor (file,
> device driver, etc.) does interruptible or uninterruptible sleeps
> when it sleeps.

And, as was discussed in the edited-out part, NFS looks like a filesystem
yet doesn't behave like a normal local filesystem.
 
> > Is this relying on slow vs non-slow devices a portable practice? 
> 
> Historically, as you've discovered, a lot of code depends on this.

Uh, yep. I also offered to fix the NetBSD tree. Since I wasn't subscribed
to current-users and the archives aren't up-to-date, I have no idea if
anyone addressed my offer. 
 
> > Why would NetBSD want to be so much more specific in this one particular
> > case than the actual standards document? 
> > 
> > Is there any good way for a programmer to know for sure whether or not
> > "slow" devices will be used as sources or destinations of data? If not
> > then why should the man page even bother to mention this distinction?
> > 
> > Why would NetBSD want to have stuff in its documentation that encourages
> > the coding of broken programs? 
> 
> Historically, there have been two aspects to man pages:
>  - a specification to users.
>  - a specification to implementors.
> 
> Because there is a lot of historic code which depends on filesystem
> I/O never returning EINTR, we should move cautiously, if at all, with
> respect to weakening this.

Oops. It's already weakened by NFS. 

Which programming practice do we want to encourage:

1) "Hmmm. Slow vs fast. Well, heck, nobody will ever use my new code
   with a slow device. I'll just be lazy and not worry about EINTR."

2) "Hmmm, write() can return EINTR. I'll handle this properly in my new
   code."

By making the slow vs fast distinction in the man page we encourage 
practice #1. Is that really the right thing to do?

> I'm inclined to tweak the text with respect to mention of "slow" -- at
> least to include it as "historic" behavior.

Then go ahead and mention it as "historic", but mention as well that 
NetBSD does not guarantee this anymore. (Or change NFS if you insist
and then hope to goodness that NFS isn't the only exception forever
and ever.)

Incidentally, did anyone happen to comment on my "what's the best fix
for the NetBSD tree" question? 
-- 
Kevin P. Neal                                http://www.pobox.com/~kpn/

"It sounded pretty good, but it's hard to tell how it will work out
in practice." -- Dennis Ritchie, ~1977, "Summary of a DEC 32-bit machine"


--2oS5YaxWCcQjTEyO--