Subject: Re: PR/34293 CVS commit: src/sys/dev
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Michael van Elst <mlelstv@serpens.de>
List: netbsd-bugs
Date: 09/07/2006 17:40:03
The following reply was made to PR kern/34293; it has been noted by GNATS.

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: PR/34293 CVS commit: src/sys/dev
Date: Thu, 7 Sep 2006 19:36:30 +0200

 On Thu, Sep 07, 2006 at 12:55:03PM +0000, YAMAMOTO Takashi wrote:
 > The following reply was made to PR kern/34293; it has been noted by GNATS.
 > 
 > From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
 > To: mlelstv@serpens.de
 > Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org,
 > 	gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
 > Subject: Re: PR/34293 CVS commit: src/sys/dev
 > Date: Thu,  7 Sep 2006 21:51:01 +0900 (JST)
 > 
 >  > >  does your "mkfs.ext2" use a block device, rather than a raw device?
 >  > 
 >  > It's not mine but it does.
 >  
 >  ok, then i think i understand the problem.
 >  we should throttle activities creating dirty buffers.
 >  maybe by having a flag for getblk and friends to tell
 >  "we are not going to dirty the buffer".
 
 If I remember the previous discussions correctly, there was some
 opposition to restricting writes this way. Please check the discussions
 about untarring source trees and waiting for X11 to respond again.
 
 Throttling getblk does not prevent the deadlock. For that you need
 to distinguish between the upper filesystem or block device, which
 needs to be throttled, and the lower filesystem (or rather the lowest,
 if you start nesting deeper), which must be kept running to avoid the
 deadlock.
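
 The upper/lower distinction can be illustrated with a toy model. This is
 a user-space sketch, not NetBSD code: the shared pool, the layer names
 and the "lower_reserve" parameter are all assumptions made for
 illustration. It only shows why the lower layer must still be able to
 get a buffer once the upper layer has filled the pool with dirty ones.

```python
# Toy model, not NetBSD code: one buffer pool shared by two layers.
# "upper" stands for the filesystem or block device on top of vnd,
# "lower" for the filesystem that holds the backing file.


class BufferPool:
    def __init__(self, total, lower_reserve=0):
        self.free = total
        # Hypothetical knob: buffers that only the lower layer may take.
        self.lower_reserve = lower_reserve

    def get(self, layer):
        """Take one buffer; return False if the caller would have to wait."""
        limit = 0 if layer == "lower" else self.lower_reserve
        if self.free > limit:
            self.free -= 1
            return True
        return False

    def put(self):
        self.free += 1


def run(total, lower_reserve, writes):
    """Dirty up to `writes` upper buffers, then clean them via the lower layer.

    Returns the number of upper buffers that could be cleaned.
    """
    pool = BufferPool(total, lower_reserve)
    dirty = 0
    for _ in range(writes):
        if not pool.get("upper"):
            break              # the writer is throttled here
        dirty += 1

    cleaned = 0
    while dirty > 0:
        # Cleaning one upper buffer needs one lower buffer for the I/O.
        if not pool.get("lower"):
            break              # no layer can make progress: deadlock
        pool.put()             # lower I/O finished, its buffer is released
        pool.put()             # the upper buffer is clean and released
        dirty -= 1
        cleaned += 1
    return cleaned
```

 Without a reserve, run(4, 0, 10) cleans nothing because the upper layer
 holds every buffer. With one buffer kept back for the lower layer,
 run(4, 1, 10) cleans all three buffers the writer managed to dirty.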
 
 There is also the problem of feedback. How would you know when
 to throttle (or rather when to stop)? You could have a mechanism
 that looks at all the device queues: if buffers pile up there
 because the device is too slow, you stop generating more.
 
 But throttling getblk isn't good enough: you only want to throttle
 getting buffers that end up in the particular slow device queue. And
 getblk doesn't have that information.
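
 A minimal sketch of such a per-device feedback check follows. It
 assumes the caller knows which device a buffer will end up on, which is
 exactly the information getblk lacks. The class, the method names and
 the limit are hypothetical.

```python
from collections import defaultdict


class DeviceQueues:
    """Hypothetical bookkeeping of queued buffers per device."""

    def __init__(self, limit):
        self.limit = limit                # assumed per-device queue limit
        self.pending = defaultdict(int)   # device name -> queued buffers

    def enqueue(self, dev):
        self.pending[dev] += 1

    def complete(self, dev):
        if self.pending[dev] > 0:
            self.pending[dev] -= 1

    def should_throttle(self, dev):
        # Only the queue of the target device matters; a slow device
        # must not stall writers that target a fast one.
        return self.pending[dev] >= self.limit
```

 A writer would ask should_throttle() for its own target device before
 producing another dirty buffer, and continue once completions drain
 that queue below the limit.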
 
 The SoC-Project that adds congestion control to filesystems might
 eventually address this problem correctly.
 
 
 >  what your patch does is the opposite.  ie. throttling attempts of
 >  cleaning buffers.  i believe it makes the situation worse.
 
 The patch does not throttle attempts of cleaning buffers. The
 bottleneck for cleaning buffers is still the underlying filesystem
 where the virtual device file resides.
 
 With the patch, when vnd throttles the writer it does two things:
 
 it prevents it from generating more dirty buffers and
 
 it delays the requeuing of dirty buffers _when the underlying
    filesystem is too slow to process them_.
 
 As a result it may not utilize the buffer cache effectively.
 You could provide a knob to control the queue length; currently
 that's hardcoded to 8 for local filesystems and 2 for NFS.
 
 It also affects vnd operation only, unlike anything that interferes
 with the more global getblk.
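
 The bounded queue described above can be sketched as follows. This is
 not the patch itself: only the limits 8 and 2 come from this message,
 while the constant names, the class and its methods are made up for
 illustration. In the kernel the writer would sleep instead of getting
 False back.

```python
from collections import deque

# Limits quoted in the message; the names are invented for this sketch.
QUEUE_LIMIT_LOCAL = 8
QUEUE_LIMIT_NFS = 2


class ThrottledQueue:
    """Toy model of a request queue that throttles its writer."""

    def __init__(self, backing_is_nfs=False):
        self.limit = QUEUE_LIMIT_NFS if backing_is_nfs else QUEUE_LIMIT_LOCAL
        self.queue = deque()

    def try_submit(self, buf):
        """Queue a write; return False if the writer has to wait."""
        if len(self.queue) >= self.limit:
            return False
        self.queue.append(buf)
        return True

    def complete_one(self):
        """The underlying filesystem finished the oldest request."""
        return self.queue.popleft() if self.queue else None
```

 Making the limit a constructor argument instead of a constant would be
 one way to provide the knob mentioned above.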
 
 
 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."