tech-kern: Re: vnode locking problem

Subject: Re: vnode locking problem
To: Chuck Silvers <chuq@chuq.com>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 03/24/1999 11:29:39

> that should do well enough for now.
> it's certainly better than hanging or panicking.
> when testing, you should try the write-into-mapping case as well
> as the read-into-mapping case that I described in the PR.

Yup, will do.
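
For reference, a rough userland sketch of the two test cases. This is
only my reading of them, not the code from the PR: I'm assuming
"read-into-mapping" means a read whose destination buffer is a mapping
of the same file, and "write-into-mapping" means a write whose source
buffer is such a mapping. The helper name self_mapping_io and the
temporary file template are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical test helper (not from the PR): do file I/O whose user
 * buffer is a mapping of the very file being read or written.
 * Returns 0 if both transfers complete, -1 otherwise.  On a kernel
 * with the locking problem the calls may hang instead of returning.
 */
static int self_mapping_io(size_t len)
{
    char path[] = "/tmp/vnlockXXXXXX";   /* assumed scratch location */
    char *fill, *map;
    ssize_t r, w;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file persists until close */

    /* Give the file some known contents through an ordinary buffer. */
    fill = malloc(len);
    if (fill == NULL) {
        close(fd);
        return -1;
    }
    memset(fill, 'A', len);
    w = pwrite(fd, fill, len, 0);
    free(fill);
    if (w != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* Fresh mapping: its pages are not yet present in this process. */
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Read case: the copy out to user space faults on the mapping. */
    r = pread(fd, map, len, 0);
    /* Write case: the copy in from user space reads the mapping. */
    w = pwrite(fd, map, len, 0);

    ok = (r == (ssize_t)len && w == (ssize_t)len &&
        map[0] == 'A' && map[len - 1] == 'A');
    munmap(map, len);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling self_mapping_io(8192) from a small main() and checking for a
zero return should be enough to tell "works" from "hangs or panics".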

Random other thoughts:

 - it's clear that this is easier to fix in a unified-buffer-cache
setup, where VOP_READ gets replaced with "map vnode into kernel VA
space, copyout, maybe unmap vnode", *assuming* that the copyout is
done while the syscall has the vnode unlocked.

 - it also feels "wrong" that a copyout which replaces the entire
contents of a page which isn't present first has to bring that page in
from backing store.  (this isn't always going to be the case for
situations like this, but it may often be).

 - Fixing the vnode locking protocols so that reads could be done with
a shared lock would be desirable (particularly for MP scalability) but
could be tricky.  (you still need an exclusive lock on the file
pointer, but that's a separate issue).  I've been thinking about a
related issue (using shared locks when possible for VOP_LOOKUP to
minimize how badly things lose when a filesystem gets unresponsive)
and may look into this as well after 1.4..

 - the upper-layer code could possibly notice this case of EFAULT,
unlock everything, touch the pages, relock, and retry the read, but
it's not immediately clear how to tell when it's safe to redo the
read, and this just feels wrong..

 - the comment you had in the PR about the problem also occurring when
the buffer should have been in BSS: I believe that linkers typically
place bss immediately after initialized data, without rounding up to
the next page... so the first part of the buffer was probably still in
.data and backed by the vnode..
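
To make the last point concrete, here is a small sketch of the
arithmetic, assuming bss starts exactly where initialized data ends
and that the final partial page of .data is the one still backed by
the file. The function name and its parameters are hypothetical; the
addresses would have to come from the linker map or from symbols such
as edata on toolchains that provide it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many leading bytes of a bss buffer sit on
 * the page that also holds the tail of .data?
 *
 *   buf       address of the buffer (assumed to lie in bss)
 *   len       length of the buffer in bytes
 *   data_end  first address past initialized data (assumed bss start)
 *   pagesize  page size in bytes, a power of two
 */
static size_t
vnode_backed_prefix(uintptr_t buf, size_t len, uintptr_t data_end,
    uintptr_t pagesize)
{
    uintptr_t off = data_end & (pagesize - 1);
    uintptr_t page_end;
    size_t n;

    /* .data ends on a page boundary: bss starts on its own page. */
    if (off == 0)
        return 0;

    /* End of the page shared between .data and the start of bss. */
    page_end = data_end - off + pagesize;

    /* Buffer starts beyond the shared page. */
    if (buf >= page_end)
        return 0;

    n = (size_t)(page_end - buf);
    return n < len ? n : len;
}
```

With 4k pages and .data ending 0x10 bytes into a page, an 8k buffer at
the very start of bss would have its first 4080 bytes on the shared
page, which would fit what the PR observed.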

					- Bill