Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys/dev/scsipi



On Thu, Apr 19, 2012 at 07:00:56PM +1000, matthew green wrote:
> 
> > > > If the driver's attach is called after cold (e.g. after a detach/rescan
> > > > of the pci bus), the driver's attach should be called with KERNEL_LOCK
> > > > held, or bad things may happen when interrupts are enabled for this 
> > > > driver.
> > > 
> > > there should be no reliance on "cold" being set for normal driver
> > > attach.  it might be a module loaded after boot.  how ever the
> > > driver is loaded, it will need to work without cold being set.
> > 
> > If it's a module laoded after boot, and the driver is not SMP-safe,
> > its attach function has to be called with KERNEL_LOCK held. Or you may
> > have problems when this driver starts receiving interrupts before its
> > attach function has returned (which is typically the case, interrupt enable
> > occurs somewhere in _attach()).
> 
> i don't see how this matters.  drivers should never enable
> interrupts if they're not ready to service them.
> 
> if this is happening, then it is a driver bug.

It looks like we're talking about different things.
I'm talking about this senario, with a non-smp-safe driver.

foo_attach()
{
    do software and hardware initialisation();

    s = splbio();
    enable_interrupt(foo_intr());
    finish_init_and_attach_children();
    splx(s);
    return;
}

foo_intr()
{
        ....
}

In this context, the driver enables interrupt, but don't expect to have its
interrupt handler called before the splx(), because
finish_init_and_attach_children() is running at IPL_BIO. But if foo_attach()
is not called with KERNEL_LOCK held, another CPU may call foo_intr(),
breaking the driver's spl protection.

And I don't think it's the driver's business to make sure KERNEL_LOCK is
held in it's attach routine: it's a non-MPSAFE driver and it should not
even have to know about KERNEL_LOCK.

If attach routines can be called without KERNEL_LOCK held after cold,
then all non-MPSAFE drivers (i.e. almost all of them) needs to be fixed.


> 
> > > in my mind, the scsi code should try to run regardless of the value
> > > of "cold" and that's why i replied above.
> > > 
> > > > What kind of senario do you have in mind ?
> > > 
> > > modules, as above.  or simply drvctl -d / -r.  IMO, only platform
> > > specific code really should depend upon cold.  "scsipi", as a
> > > relatively high level subsystem, should not.
> > 
> > But I don't think it's the job of non-SMP-safe drivers to make sure scsipi
> > is called with KERNEL_LOCK held (and scsipi may be the only subsystem
> > to check for KERNEL_LOCK(), but others do rely on it as well - e.g.
> > sys/dev/ata). Either attach() functions should always be called with
> > KERNEL_LOCK(), or for checks as above we have to accept that when
> > cold, locking is not fully up yet and lock checks should be bypassed.
> 
> the lack of checks elsewhere is not the point here.  we should add
> them to dev/ata (and more) as well, if they depend upon these.
> 
> currently, autoconf isn't run with the kernel lock held at any point
> and i don't think that saying that we should hold it during autoconf
> is a good change.  we should fix the cases that trigger problems when
> they appear, and it sounds like you've hit one in your new driver.
> holding the kernel lock here would be a major step backwards.
> 
> scsipi depends upon kernel lock.  thus, callers should arrange for it
> to be held.  since these are drivers calling, it is upto each driver
> to do this currently.  this is what we've done with other problems we
> have hit as they've arrived.

if the caller is MPSAFE, I agree. Non-MPSAFE driver should not have to care
about KERNEL_LOCK (precisely because they are not MPSAFE).
The basic assumption about kernel locking was that that driver and subsystems
using spl locking would continue to work as it, without changes. If we
change this assumption a lot of code needs to be changed (that is, almost
all our drivers).

> 
> it may be that we have this problem in a large number of drivers but
> these KASSERT()s have been in place for a long time now and, until
> your change, we've fixed it at the caller level not worked around it
> at the scsipi level.

AFAIK the only drivers that has been changed are ncr53c9x and USB, and they
are MP-safe (they already use mutex instead of spl locking).

> 
> your change can't work with the drivers that are broken without the
> major regression of holding kernel lock over autoconfig, or, never
> ever having them be modules or detached/reattached.

Maybe non-MPSAFE drivers can't be modules or detached/reattached, indeed.
But that's a problem with autoconf not dealing with non-MPSAFE drivers,
not the driver themselve (they were working before the MP changes, they
should continue to work as-is). If you want autoconf to not run
under the KERNEL_LOCK when not needed, then is has to be extended so that
MPSAFE drivers can register themselves as MPSAFE, so that non-MPSAFE
drivers are still called with KERNEL_LOCK.

> 
> please make "drvctl -d" and rescan work for your drivers.  this will
> make the checks against "cold" go away.

I don't want to touch each driver in our tree for drvctl -d/rescan to work.
If we want this to work for non-MPSAFE driver, then this has to be fixed
in autoconf (if the problem really exists, which I've not looked at yet).

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 ans d'experience feront toujours la difference
--


Home | Main Index | Thread Index | Old Index