Subject: Re: IO throttle VOP
To: None <>
From: David Laight <>
List: tech-kern
Date: 12/18/2001 12:18:50
> > It does actually, I did a special-cased sample implementation
> > as a proof of concept, and it works for the softdep case.
> Could we see that?

I don't doubt that deferring VOP requests until there is enough kma
resource fixes the test you were running!  Indeed deferring the
requests is probably necessary.  I just don't think you've got the
mechanism right.

> > > What would you do with the layered filesystems?
> >
> > Not sure what you mean.. those would usually just pass down the
> > VOP to the lower layer.
> Well, they'll have to transform the vnodes to the lower ones.

Yes and that is all the work - why do it twice!

Additionally a single VOP request into the layered fs might result in
many calls to the underlying fs.  The fs I have (last port was to
unixware 7MP) supports 32 layers and will dynamically create/delete the
top layer directories in order to create a file.  So a single VOP_INACTIVE
call may cause any number of VOP_RMDIR operations!
> I'd recommend against the array of vnodes. Mainly as that would be a new
> kind of thing to pass around in our vnode interface. Right now each vop
> can pass in up to 4 vnodes, and one vpp*.
> So as a minor nit, I'd vote to make the call just pass in one or two
> vnodes. I don't think we ever should have more than two vnodes locked at
> once (you enter namei() with no locks, and can return with at most two),
> so that should be fine.

Mmmm locked vnodes?  The UW7 vfs stuff manages with a reference count
for almost all purposes.  The onus is left to the fs code to serialise
multiple requests for the same vnode (there is a standard way of stopping
read and write from multiple tasks).  So no vnodes are locked for long.

Seems to me that the sequence:
    VOP_THROTTLE( vp, .... );
    VOP_xyz( vp, ... );
is not really different from:
    VOP_xyz( vp, ... );

myfs_xyz( vp, ... ) {
    myfs_throttle( vp, ... );
    ...
}

The latter needs no extra VOP stuff.
What have I missed?
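
To make the comparison concrete, here is a minimal user-space sketch of
the two shapes.  Everything in it is hypothetical: the cut-down
struct vnode, the counters and the function names are stand-ins for
illustration only, not the real vnode interface.  caller_a() plays the
part of the VOP_THROTTLE + VOP_xyz sequence, caller_b() the part of a
bare VOP_xyz into an fs that throttles itself.

```c
#include <assert.h>

/* Hypothetical stand-in for a vnode; only counts what happened to it. */
struct vnode {
    int throttle_calls;   /* how often the throttle check ran */
    int op_calls;         /* how often the operation itself ran */
};

/* Hypothetical throttle check.  A real one would sleep until enough
 * resource is available; this one only records that it was called. */
static void myfs_throttle(struct vnode *vp)
{
    vp->throttle_calls++;
}

/* Variant A: the operation relies on the caller having throttled. */
static int myfs_xyz_bare(struct vnode *vp)
{
    vp->op_calls++;
    return 0;
}

/* Variant B: the operation throttles itself before doing the work. */
static int myfs_xyz_self_throttling(struct vnode *vp)
{
    myfs_throttle(vp);
    vp->op_calls++;
    return 0;
}

/* Call site for variant A: two calls at every place the op is used. */
int caller_a(struct vnode *vp)
{
    myfs_throttle(vp);            /* stands in for VOP_THROTTLE(vp, ...) */
    return myfs_xyz_bare(vp);     /* stands in for VOP_xyz(vp, ...) */
}

/* Call site for variant B: one call, no extra vnode operation needed. */
int caller_b(struct vnode *vp)
{
    return myfs_xyz_self_throttling(vp);
}
```

In this sketch both call sites leave the vnode in the same state; the
only difference is where the throttle call lives.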

I guess NFS hits the VOP_xxx functions directly as well - rather than
going through the system call code?
> >
> > Callbacks to free resources are also troublesome, since pushing out
> > softdeps may mean having to take some locks, possibly vnode locks.
> > Also, resource usage may temporarily increase when pushing them
> > out. So, you must do it from a controlled environment, in which
> > you know you can't get into deadlock trouble. The syncer process
> > is such an environment. Others (like the pagedaemon, or even
> > from any other process as part of a callback) will likely lead
> > to disaster.

Yes - but something needs to tell the softdeps code that it had better
release some of its resources (or is it always doing it as fast as
it can?).  The process that does the releasing doesn't have to be the
one that made the request - indeed, as you said, that isn't a good
place to try.
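
A rough sketch of that split, assuming a simple "release wanted" flag.
The struct dep_pool and both function names are made up for
illustration and are not taken from the softdep code; a real version
would also need locking and a wakeup for the syncer.  The requester
only records that a release is wanted, and the draining happens later
in whatever context is known to be safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a pool of dependency resources. */
struct dep_pool {
    int  in_use;          /* resources currently held */
    int  low_water;       /* level to drain down to when asked */
    bool release_wanted;  /* set by requesters, cleared by the syncer */
};

/* Called from any context that notices a shortage.  It only records
 * the request: it takes no locks and frees nothing itself. */
void dep_request_release(struct dep_pool *p)
{
    p->release_wanted = true;
}

/* Called only from the syncer-like context.  Drains the pool down to
 * the low-water mark if a release was requested and returns how many
 * resources were released. */
int dep_syncer_tick(struct dep_pool *p)
{
    int released = 0;

    if (!p->release_wanted)
        return 0;

    while (p->in_use > p->low_water) {
        p->in_use--;      /* stands in for pushing out one dependency */
        released++;
    }
    p->release_wanted = false;
    return released;
}
```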