Subject: Re: Possible bug relating to malloc()/realloc(), popen(), and read()
To: None <port-i386@NetBSD.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-i386
Date: 12/04/2004 01:22:16
>> Here is your actual problem.  [...]

> Ok.  I thought it would always read the full block_size unless it
> ran into EOF or got an error.

No; this has never been promised for anything but plain files.  It
would be perfectly legal - though rather odd and inefficient - for a
read from a pipe to always return at most one byte.

> Apparently I cannot depend on that due to buffering issues when
> reading from a network pipe as in popen.

A "network pipe"?  Pipes and network connections are actually quite
different in recent NetBSD.  (In less recent NetBSD, they have been
only slightly different; going back before that, before NetBSD qua
NetBSD existed, there was a time when network connections didn't exist
and pipes were yet a third sort of animal, one which no longer exists.
There may even have been other sorts of pipe, I don't know.)

> To be honest, I was still unclear about the advantages/disadvantages
> of read() vs fread() at first,

The biggest difference is that one uses a file descriptor and the other
uses a stdio stream.  (Depending on what you're doing, this can be an
advantage for either one.)

In this case, since popen() returns a stdio stream, using stdio (in
this case probably fread()) is more appropriate.  I say bypassing
stdio is probably a theoretical problem rather than a practical one
because I know the implementation well enough to know that currently,
provided you grab the fd as you do, with fileno(), before doing any
stdio I/O, you can get away with ignoring stdio.  But strictly
speaking this is an accident of the implementation.

The advantage of read is that it avoids stdio buffering, whereas the
advantage of fread is that it provides stdio buffering. :-)

If you want to transfer large amounts of data at high speed, with
little or no processing, read() is your friend.

If you want to do I/O in small numbers of bytes at a time, or relatively
complicated I/O like scanf()/printf(), fread() is your friend -
basically, if the cycle cost of the buffering is small compared to the
cycle cost of the processing, stdio's flexibility is usually worth it,
especially when you factor in the added maintainability.

For what you're doing, the difference will almost certainly be swamped
by other factors such as context switch overhead.

>> Leaving aside the question of using stdio or not, I'd write that as
>> something like this:

>>  /* I consider NULL harmful;

> I am curious why you consider NULL harmful?

The lesser reason is that it tends to get confused with NUL, ASCII
character code 0, which C canonically writes '\0'.  For example, I have
actually seen code that does things like "*sp = NULL;" to terminate a
character string - and yes, I know that's broken, that's my point:
someone got NULL and NUL confused enough to write it.

The greater reason is that it isn't what it purports to be, which is, a
polymorphic nil pointer (called a "null pointer" by most references, a
term I dislike because of its spelling similarity to, and confusion
danger with, NULL).  It is a polymorphic nil pointer in some contexts,
yes, such as the RHS of an assignment statement where the LHS is of
pointer type.  But it is not a polymorphic nil pointer in exactly those
cases where a polymorphic nil pointer is most needed - those where
there is no compile-time type available for the rvalue.  The commonest
such case that comes to mind is an argument where no prototype in scope
specifies a type for that argument.  (I _think_ that's the only case,
but I'm not quite sure enough to say so outright.)

In fact, NULL is a polymorphic nil pointer in only and exactly those
cases where 0 is also a polymorphic nil pointer.  (Indeed, one of the
acceptable definitions for NULL is just that: 0.)  Other cases -
perhaps the commonest is the execl() arglist terminator - require a
cast in order to be portable to machines where int and char * are not
the same size, where integer 0 is not the same bit pattern as a nil
char *, or where integer 0 and nil char * are passed in different ways.

This leaves as the only benefit of NULL over 0 that it is documentation
to human readers that the rvalue is conceptualized as a nil pointer by
the code's author.  Given the confusion and the misuses I have seen
result, I consider this benefit to be far outweighed by the problems
that come with it.

>>     I also consider gratuitous embedded assignments harmful. */
>>  output = malloc(block_size);
>>  if (output == 0) { ...can't malloc... }
> That is an interesting comment.  It seems like coding examples almost
> always show embedded return value assignments and I often find it
> convenient in looping.  How do you consider it harmful?

As for "convenient", note that I said *gratuitous* embedded
assignments.  For example, assignments in the first or third part of a
for loop's header are not gratuitous - indeed, the whole point of a for
loop is to bring those two assignments and the test together, for use
when they conceptually belong together.

As a rule of thumb, if an assignment expression's value is used, I will
look at pulling it out into an "expression;" statement of its own.  I
won't always end up doing so; other considerations may override this
tendency (for example, if the lvalue assigned to is textually
complicated or expensive to compute).

> What I still am not clear about is:
>     Why does it consistently read only 1024 bytes on the first
>     read and then always read the full specified block size on all
>     subsequent reads.

This is almost certainly an artifact of the way the pipes popen uses
are implemented - see below for more.

>     Why does it not do that on FreeBSD?  This is one of the things
>     that made me wonder if there was a bug in NetBSD.

Because FreeBSD does pipes differently.  I think FreeBSD still does
pipes as AF_LOCAL socketpairs (this is the intermediate historical
implementation I referred to above).

>     Also, why does the problem not exist when I run it through the
>     debugger (gdb).  In that case it always seems to read the full
>     specified block size, even on the first read if I step through
>     it.  If I let it run at normal speed through the first two reads,
>     it still only reads 1024 bytes on the first read.  It is
>     apparently timing related.

This does not really surprise me, though to explain it in full would
probably require delving into things I can't really look at, such as
the exact alignment of the buffers used in the output code of the
process you popen()ed.  I would guess that the first write writes only
1024 bytes, so when running at full speed it immediately
context-switches to your code, which finds only 1024 bytes waiting.  If
you stop in gdb, your process stops while interacting with you, the
writing application gets to run again, and it fills up the pipe, so
when you get around to actually doing the read - even if only a very
short time later on human timescales - the pipe is full and you get all
the bytes you ask for (since you ask for less than the pipe can hold).
(The part that would require complicated explanation is why the first
write writes only 1024 bytes - and why later writes don't, or,
alternatively, why the writer gets to write twice for one read by the
reader.)

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B