Subject: Re: Redoing file system suspension API (update)
To: None <hannken@eis.cs.tu-bs.de>
From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
List: tech-kern
Date: 06/20/2006 18:22:12
> > first of all, i tend to think filesystem snapshot thing should be done
> > entirely in filesystem-dependent code.
> 
> Depends on what to expect from suspension.  I expect a file system state
> where system calls are the atomic operations.

isn't it almost the same as VOPs?  (with some exceptions, of course)

> > i don't think it's desirable for each subsystems to put their own
> > random hooks in these places.
> 
> It is possible to put the suspend/resume around calls to device
> functions (d_open, d_read etc) in spec_vnops.c, socket functions (so_receive,
> so_send etc) in fifo_vnops.c, around ttywait(), selcommon() and pollcommon().
> That is what I did in my first proposal.

i don't think this suspend/resume is a good idea at all.

> > what happens if a filesystem itself sleeps with PCATCH?
> > (maybe you can call it a bug, but we currently have such code.)
> 
> Yes, it is a bug.  Which file system btw?

lfs and nfs, at least.  you can grep. :-)

> > > To solve the rest of 3) it adds a throttling on the first gate not involved
> > > in a suspending file system.
> > 
> > - isn't it normal that an operation becomes slow when the system has
> >   other activities?
> 
> Slow, yes. But in case of suspension the sync-to-disk becomes very slow.
> Throttling other i/o reduces the time to suspension from > 5 minutes
> to < 30 seconds on my test machine.

- is it true even if filesystems are backed by different disks?
- why does it need special care?

> > - why do you check P_SYSTEM?
> 
> I don't see the above problem (high i/o load) for any system process yet.

checking P_SYSTEM is not an appropriate way to tell whether a process can
generate high i/o load.

eg.
	- dmover software-backend.
	- we might make nfsd a real kernel thread at some point.

> > > ** The new API is:
> > > 
> > > Using explicit enter()/leave() pairs adds much complexity so I took another
> > > approach. I use two types of gates.  Normal gates need a "leave" operation.
> > > Permanent gates are valid until the thread returns to user mode.
> > 
> > while it can make your patch smaller, i think it's actually more complex
> > and harder to understand and maintain.
> 
> Where is the complexity and maintenance?

it introduces one more thing which should be considered whenever you do
lwp-switch.  it seems complex to me.
i can't believe putting vfs code into ltsleep is a good idea.
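
for readers following the thread, the two gate types from the quoted
proposal can be pictured roughly as below.  this is only a sketch under
assumptions: struct sketch_lwp and every sketch_* function are hypothetical
names made up for illustration, not code from the patch.  it shows the
per-thread state a permanent gate implies, and the extra hook that has to
run when the thread returns to user mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread gate state; the real patch may differ entirely. */
struct sketch_lwp {
	int  normal_gates;	/* normal gates entered and not yet left */
	bool permanent_gate;	/* held until the thread returns to user mode */
};

/* Normal gate: the caller must pair this with sketch_leave_normal(). */
static void
sketch_enter_normal(struct sketch_lwp *l)
{
	l->normal_gates++;
}

static void
sketch_leave_normal(struct sketch_lwp *l)
{
	assert(l->normal_gates > 0);
	l->normal_gates--;
}

/* Permanent gate: no explicit leave; it is dropped on return to user mode. */
static void
sketch_enter_permanent(struct sketch_lwp *l)
{
	l->permanent_gate = true;
}

/* Assumed hook on the return-to-user path (hypothetical). */
static void
sketch_return_to_user(struct sketch_lwp *l)
{
	l->permanent_gate = false;
}

/* True while the thread still holds any gate. */
static bool
sketch_holds_gate(const struct sketch_lwp *l)
{
	return l->normal_gates > 0 || l->permanent_gate;
}
```

the point of the sketch is that the permanent gate is released implicitly,
so any code that switches or sleeps the thread has to know about this state.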

> > please try to avoid putting subsystem-specific data into struct lwp.
> 
> If we use permanent gates we have per-thread state.  Where should this state go
> if not into struct lwp?

i meant permanent gate is a bad idea.

> > >   V_NOERROR	Panic on error.  No need for the caller to check the result.
> > 
> > what's the point of this?
> 
> I like style where results are not silently ignored.  Any usage of vngate_enter
> without V_NOERROR and ignoring the result is a coding error.

does V_NOERROR in your patch mean that you are sure no error can happen in
these places?
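
to make the question concrete, here is a minimal sketch of the semantics
described in the quoted text ("Panic on error.  No need for the caller to
check the result.").  only the names V_NOERROR and vngate_enter come from
the thread; the flag value, struct sketch_gate, sketch_gate_enter(),
sketch_panic() and the EBUSY failure condition are all assumptions made
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define V_NOERROR 0x01	/* flag name from the thread; the value is made up */

/* Hypothetical gate; 'suspended' simulates a condition where entering fails. */
struct sketch_gate {
	int suspended;
	int entered;
};

/* Hypothetical stand-in for the kernel's panic(). */
static void
sketch_panic(const char *msg)
{
	fprintf(stderr, "panic: %s\n", msg);
	abort();
}

/*
 * Stand-in for vngate_enter(): returns 0 on success or an errno value.
 * With V_NOERROR set, a failure panics instead of returning, so the
 * caller may ignore the result.
 */
static int
sketch_gate_enter(struct sketch_gate *g, int flags)
{
	int error = g->suspended ? EBUSY : 0;

	if (error != 0) {
		if (flags & V_NOERROR)
			sketch_panic("gate enter failed with V_NOERROR set");
		return error;
	}
	g->entered++;
	return 0;
}
```

under this reading, passing V_NOERROR is only safe at call sites where the
failure path really cannot be taken; otherwise the error turns into a panic.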

> > why do you put vngate_enter into FILE_USE, rather than VOPs?

> If you meant putting the gates inside the VOP_XXX functions, this cannot work.

i meant this.

> Some VOPs need to be called with simple locks so we cannot sleep here.

do you mean getpages/putpages?
you can deal with them differently.

YAMAMOTO Takashi