Subject: Re: i thought
To: Bill Studenmund <wrstuden@netbsd.org>
From: Pavel Cahyna <cahyna@pc313.imc.cas.cz>
List: tech-kern
Date: 07/26/2005 11:37:29
On Mon, 25 Jul 2005 12:29:42 -0700, Bill Studenmund wrote:

> On Mon, Jul 25, 2005 at 09:19:42PM +0200, Martin Husemann wrote:
>> On Mon, Jul 25, 2005 at 08:22:39PM +0200, Zeljko Vrba wrote:
>> > quite often everything just stops due to ioflush. And I mean, STOPS.
>> 
>> What are you doing? NFS over a 9600 bps SLIP link?
> 
> No, he's not.
> 
> The problem is that our UBC system doesn't put back-pressure on programs. 
> You can dirty pages with mmap very quickly yet not trigger back-writing of 
> the pages. So you fill up all of RAM, then have to wait for the ioflush 
> system to turn on and flush data out.

Are you sure that filling RAM via mmap is the real problem? I observed
the same behaviour when a process was using a large database. ktrace
showed that it was doing a lot of reads and writes to that large file.
After some time, it blocked in read() for many seconds, CPU usage dropped
to almost zero, and the disk queue filled with write requests. The process
was waiting on vnlock, not for memory. I checked in DDB which process was
holding the lock: it was ioflush!

So IMHO the problem is contention for the vnode lock. When ioflush
flushes that file, it locks the vnode, and no other operation on the file
is possible until the flush finishes. (I believe this is true even if the
requested data are in the page cache, because VOP_READ is always called
with the vnode locked.)
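
To illustrate the contention I mean, here is a minimal userspace sketch in
Python, not kernel code. Vnode, flush(), read() and simulate() are
hypothetical names made up for this model, and the sleep merely stands in
for the disk I/O. The only assumption it encodes is the one above: the
reader must take the same exclusive lock as the flusher, even when the
data it wants is already cached.

```python
import threading
import time


class Vnode:
    """Toy stand-in for a vnode: one exclusive lock plus cached data."""

    def __init__(self, cached):
        self.lock = threading.Lock()
        self.cached = cached


def flush(vp, hold_seconds, started):
    """Model the flusher: hold the vnode lock for the whole write-back."""
    with vp.lock:
        started.set()             # signal that the flush now owns the lock
        time.sleep(hold_seconds)  # stands in for waiting on disk writes


def read(vp):
    """Model read(): take the vnode lock even though the data is cached."""
    t0 = time.monotonic()
    with vp.lock:
        data = vp.cached
    return data, time.monotonic() - t0


def simulate(hold_seconds=0.3):
    """Start a flush, then read cached data; return (data, seconds waited)."""
    vp = Vnode(b"already in page cache")
    started = threading.Event()
    flusher = threading.Thread(target=flush, args=(vp, hold_seconds, started))
    flusher.start()
    started.wait()                # make sure the flusher got the lock first
    data, waited = read(vp)
    flusher.join()
    return data, waited
```

In this model, simulate(0.3) returns the cached bytes together with the
time the reader spent waiting, which should be close to the time the
flusher held the lock, although the read itself needed no I/O at all.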

> Part of the problem is that we flush "files". Thus we will flush or not
> flush a file on a file by file basis. This works great if we have a lot
> of writes to different files; we will hit on different files, and keep a
> steady stream of data going to the disks.

I'm afraid that even if a steady stream were achieved, there would be
exactly the same waste of CPU time, because ioflush would keep the vnode
locked for the same amount of time.

Is my analysis correct?

Bye	Pavel