Subject: Re: wrap up of pipe(2)
To: NetBSD Userlevel Technical Discussion List <tech-userlevel@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-userlevel
Date: 10/14/2001 14:05:12
[ On Sunday, October 14, 2001 at 12:42:05 (+0700), Robert Elz wrote: ]
> Subject: Re: wrap up of pipe(2) 
>
>     Date:        Thu, 11 Oct 2001 14:21:01 -0400 (EDT)
>     From:        woods@weird.com (Greg A. Woods)
>     Message-ID:  <20011011182101.22938EA@proven.weird.com>
> 
>   | I most certainly ALWAYS want the platform documentation delivered with a
>   | given release to EXACTLY describe the implementation in that very release!
> 
> That's almost impossible, and essentially useless.   Just why would anyone
> need that level of detail?

Perhaps I over-emphasised "always" and "exactly"....   :-)

Indeed, without employing a technique such as literate programming it is
practically impossible to have documentation that ALWAYS, EXACTLY
describes the implementation.  What I really intended to imply there was
a desire for the level of detail afforded by a well-worded standards
document, for example -- something almost no systems documentation I've
ever seen, for any system, has managed to live up to.

> If it fails to describe some behaviour that you can observe, then yes,
> that's bad.  On the other hand, if it says "X might happen if Y", there's
> no problem if X doesn't actually happen.  You're not being guaranteed it
> will happen, just being told it might.

Yes of course -- and I don't mind such a warning at all, though I'd
still rather know the difference between something that might happen in
some generic system vs. what could in fact actually happen in the given
system I'm using at the moment.

> Absolutely, no problem there.   But what we're discussing here is the man
> page saying that you might get an EFAULT from a bad address passed to the
> pipe() sys call.   There's nothing wrong with that - including when you
> know that what will actually happen is that you'll get a SEGV instead
> (any use of a bad address in a program can generate a SEGV).

Strictly speaking, yes, there is a problem with this.  If I use the
source code as my documentation then I learn that I cannot ever get an
EFAULT from pipe().  So that to me smells a lot like a bug in the manual
page and I'd fully expect anyone with such attention to detail to
send-pr it!

However I certainly won't complain if the manual tells me that EFAULT is
a possible error on some systems, but _not_ on the one I'm reading that
manual on.  I don't expect such generic information, but I won't
complain if it's there.

> Pass a bad address to read(2) and you'll usually get EFAULT, but I can
> trivially change the implementation of the read syscall so a SEGV happens
> instead (I can trivially change the calling program so a SEGV happens instead).

What's important here is that if you were to make such a change, you
would also change the documentation to state that the current
implementation will not return EFAULT when passed a bad address, but
that a SIGSEGV will be raised instead.  At least it's important if you
need to communicate a description of your implementation to any other
user of your system.

> You need to be able to deal with both - the man pages should make that
> clear.

I only need to be able to deal with both if I'm writing portable code
(and that includes code that'll port to a new release of the same
system).  Yes I do generally want to write portable code, but I do not
need the system manuals to tell me how to do this.
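
For anyone who wants to see which of the two behaviours a given system
actually exhibits, here is a rough sketch of a probe.  It forks a child
that hands a bad address to pipe(2), and the parent reports whether the
child saw EFAULT or was killed by a signal.  The function name, the enum
and the exit codes are made up for this illustration, and a null pointer
stands in for "a bad address"; this is not taken from any NetBSD source.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names, used only for this illustration. */
enum bad_addr_outcome {
	OUTCOME_EFAULT,		/* pipe() returned -1 with errno == EFAULT */
	OUTCOME_SIGNAL,		/* child was killed (SIGSEGV, SIGBUS, ...) */
	OUTCOME_OTHER		/* fork/wait failed, or some other result */
};

enum bad_addr_outcome
probe_pipe_bad_address(void)
{
	/* volatile keeps the compiler from reasoning about the null pointer */
	int *volatile bad = NULL;
	int status;
	pid_t pid, w;

	pid = fork();
	if (pid == -1)
		return OUTCOME_OTHER;

	if (pid == 0) {
		/* Child: either this returns an error or we die right here. */
		int rc = pipe(bad);
		_exit(rc == -1 && errno == EFAULT ? 10 : 11);
	}

	/* Parent: wait for the child, retrying if interrupted. */
	do {
		w = waitpid(pid, &status, 0);
	} while (w == -1 && errno == EINTR);
	if (w == -1)
		return OUTCOME_OTHER;

	if (WIFSIGNALED(status))
		return OUTCOME_SIGNAL;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 10)
		return OUTCOME_EFAULT;
	return OUTCOME_OTHER;
}
```

Portable code has to be prepared for either OUTCOME_EFAULT or
OUTCOME_SIGNAL; the probe only tells you what the system in front of you
does today.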

In this particular case the argument against claiming EFAULT as possible
is, I agree, somewhat pedantic (it's more important that the manual also
mention the possibility of SIGSEGV being raised).

>   | If that were to come to be then I would fully expect there either to be
>   | platform specific manual pages for the call in question,
> 
> No thanks.

I didn't think you'd like that alternative, at least not for a system
call, but it's clearly not without precedent in other parts of the
manual.

>   | I can't quite believe you said that.  Of course there's always the
>   | source, but the source is not the documentation!
> 
> It is the best documentation, and always will be, for those who really
> want to know the intimate details of exactly what happens today, rather
> than what the system is guaranteeing will continue to happen.

Of course -- but what about when the source clearly identifies there's a
bug in the documentation?  What then?  Which do you fix?  In an ideal
world they always exactly match.

The key phrase in your reply, though, is "for those who really want to know".

Not everyone does -- and not everyone can.  The documentation must be
able to stand in for the source, and should endeavour to be just as
accurate.

> The only stuff that should be documented are the parts you can rely on,
> stuff that won't just change on someone's whim.

No thanks!  That just won't do.  It's not the way things are now either!

If I cannot ever receive an EFAULT from pipe(2) on even just one
architecture then I want the manual page to clearly document that fact.

I won't complain if the manual page speculates about future
implementations (indeed I welcome such conjecture), but I don't want
such postulation to be given as a current fact.

>   As it is now, if on some
> port, the portmaster decided that passing the pipe array address into the
> sys call was a better design (for that port) than returning 2 integers,
> they can simply go ahead and make the change, and return EFAULT instead
> of generating a SEGV, with essentially no prior discussion, and certainly
> without anyone being able to object on the basis that the change would be
> breaking a defined interface - which they would be able to do if EFAULT
> weren't listed as a possible return code from pipe().

The issue of EFAULT vs. SIGSEGV is less pedantic if there's some
difference in implementation between different architectures.

> Or, let's take a different example, read(2) is defined as returning EINVAL
> if "The total length of the I/O is more than can be expressed by the ssize_t
> return value."   Which of NetBSD's current ports can ever actually return
> that error (for that reason) ??   Maybe some can, but there are certainly
> some of them where it will never happen.

Well, since ssize_t is a (signed) int and the requested length is an
unsigned size_t, the possibility is there on at least some platforms....

>   The man page is just fine however.

In this case the manual is worded in such a way that you don't have to
worry about it -- you can do your own check using SSIZE_MAX.

(it is troubling that EINVAL has so many potential meanings, but that
was discussed in another thread....)

All we're really quibbling over here is the exact wording of how EFAULT
and SIGSEGV/SIGBUS/etc. should be mentioned in pipe(2).  The wording
Bill committed almost two weeks ago is sufficient, but it's not complete
and continues the seeming tradition in Unix documentation of being
sometimes oblique and often more terse than strictly necessary.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>