Subject: Re: CVS commit: src/sys/dev
To: None <source-changes@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: source-changes
Date: 09/04/2006 10:07:01
On Mon, Sep 04, 2006 at 10:20:29AM +1000, Daniel Carosone wrote:
> On Sun, Sep 03, 2006 at 07:49:34PM +0000, Manuel Bouyer wrote:
> > 
> > Modified Files:
> > 	src/sys/dev: vnd.c
> > 
> > Log Message:
> > Back out rev 1.149.
> > Unfortunably this reopens PR/10731, PR/12189, PR/20296, PR/34293
> > 
> > Also, others stacked block device drivers may also have this issue.
> 
> Yes, and cgd/dk_subr has a solution to essentially the same issue,
> from a different cause.  In that case, the encryption routines need to
> allocate a buffer to copy the ciphertext into, and this can fail in
> interrupt context (when trying to start the next request from a
> completion interrupt of the previous).  If that happens, the strategy
> routine just queues the request, which will be restarted later (and we
> keep a single static buffer around to ensure we can always make
> progress). Perhaps it's time to reiterate my suggestion that vnd be
> restructured similarly.
> 
> I took a quick look at this before you made these changes last time,
> and the changes were fairly large in terms of lines touched, but
> mostly mechanical: mostly, code needed to move from strategy to the
> thread part that runs behind start, and start needed to be split into
> a small part that does the wakeup, and the part that gets woken.  I
> think in the more recent context, this means moving more work into the
> thread you added and making the normal driver entry points just queue
> requests and send wakeups.

AFAIK this won't be enough. With the current code, the strategy routine 
already queues buffers and wakeup the thread. What seems to happens is
that eventually all available buffers ends up in this queue between
strategy and vnd, and the thread doesn't have available buffers to
process the queue (and I suspect, to process a buffer from this queue,
more buffers a needed later, to process write to the vnode, eventually
raidframe or ccd, etc ...).
What we need is a way to limit the size of this queue. And I think all
strategy routine suffers from this, but in the general case it just causes
a performance problem and not a deadlock. Maybe a strategy should be
able to return EAGAIN, and have the caller deal with it.

-- 
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
     NetBSD: 26 ans d'experience feront toujours la difference
--