Subject: RE: Does BUFQ_GET/PUT need a lock?
To: Juergen Hannken-Illjes <hannken@NetBSD.org>
From: Gordon Waidhofer <gww@traakan.com>
List: tech-kern
Date: 07/22/2004 01:55:22
Jürgen Hannken-Illjes - hannken@netbsd.org wrote
> > Multiple CPUs can enter strategy(), perhaps
> > with different partitions each, and try to
> > BUFQ_PUT(). There is no lock to keep things
> > sane. Similarly, the device start() routine
> > could conflict with threads running in strategy().
>
> The disk driver is responsible for locking.
> As the bufq usually is part of a drivers control struct this
> struct should get the spinlock.

I concur with this. Some device drivers need to
simultaneously allocate private resources (control
blocks) along with a BUFQ_GET(). It's best if it's
all under the same lock.

dev/ata/wd.c/wdstrategy() does not lock.
dev/scsipi/sd.c/wdstrategy() does not lock.
I've not checked others, but these are the
biggies.

Similarly, net/if_ethersubr.c/ether_output() does
not lock and has a similar issue.

There are few callers of BUFQ_GET()/BUFQ_PUT()
and so it's reasonable to fix these drivers.

But there are a lot of network drivers calling
IFQ_ENQUEUE()/IFQ_DEQUEUE(). These macros are
complex. The right thing is still for the device
drivers to own the lock. But there may be motivation
to put the locks in the IFQ macros so that all
network drivers get fixed up quickly.

Again, I might be missing something. I've seen
references to a "big lock," but don't know
anything about it. Is the big lock handling
strategy()/start() and output()/start() exclusion?

Regards,
	-gww