Subject: Re: Bug in timeout()/untimeout() ?
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: enami tsugutomo <enami@sm.sony.co.jp>
List: tech-kern
Date: 04/14/2000 20:07:14
Manuel Bouyer <bouyer@antioche.lip6.fr> writes:

> On Fri, Apr 14, 2000 at 11:27:51AM +0900, enami tsugutomo wrote:
> > Since there is a function call between each test of calltodo.c_next,
> > the sane compiler shouldn't assume it can be cached.  At least 1.4.2
> 
> in scsipi, xs_status was changed to 'volatile' (commit message:
> revision 1.5
> date: 1997/11/20 04:09:19;  author: thorpej;  state: Exp;  lines: +2 -2
> Declare the scsipi_xfer's "flags" member to be volatile, to force it
> to be reloaded every time it is checked.  This avoids a condition where
> it can be cached in a register in such a way that updates to the flags in
> an interrupt handler to not be noticed, which in turn causes the process
> doing the i/o to sleep forever.  Bug report and suggested fix from
> Hiroshi HORIMOTO <horimoto@cs-yuugao.cs.sist.ac.jp>, PR $4460.
> 
> because of this code:
>                 s = splbio();
>                 while ((xs->xs_status & XS_STS_DONE) == 0)
>                         tsleep(xs, PRIBIO + 1, "scsipi_cmd", 0);
>                 splx(s);
> 
> And tsleep *is* a function call. The case here is quite similar.

No.  Please read the PR carefully.

From PR #4460:

382:    switch (scsipi_command_direct(xs)) {
383:    case SUCCESSFULLY_QUEUED:
384:            if ((xs->flags & (SCSI_NOSLEEP | SCSI_POLL)) == SCSI_NOSLEEP)
385:                    return (EJUSTRETURN);
386:#ifdef DIAGNOSTIC
387:            if (xs->flags & SCSI_NOSLEEP)
388:                    panic("scsipi_execute_xs: NOSLEEP and POLL");
389:#endif
390:            s = splbio();
391:            while ((xs->flags & ITSDONE) == 0)
392:                    tsleep(xs, PRIBIO + 1, "scsipi_cmd", 0);
393:            splx(s);
394:    case COMPLETE:          /* Polling command completed ok */

The problem described in the PR is that the value resulted by the
evaluation of xs->flags at the line 384 is cached and used when the
first evaluation of xs->flags at the line 391 (and tsleep function
call occurs after it).

Note that the splbio() is not a function call on x68k/m68k
architecture.  So, a compiler thought that it is safe to reuse the
value at first time, but actually an interrupt handler may alter it
(and it isn't blocked).

> What happens with other processors and/or more agressive compiler flags ?

Of course, an insane compiler may generate an insane code:-).

enami.