Current-Users archive


Re: root-on-RAID set always dirty on startup



On Fri, Jul 10, 2009 at 06:16:39PM +0200, Jukka Salmi wrote:
> Jukka Salmi --> current-users (2009-06-12 03:10:11 +0200):
> [...]
> > Rev. 1.377; but wd.c is not the culprit -- the recent file descriptor
> > access performance improvements seem to be: when using sources prior
> > to that change (`cvs up -D 2009.05.23.18.25.00') parity is clean on
> > startup; with sources after it (`cvs up -D 2009.05.23.18.28.10') parity
> > is always dirty.
> 
> Some debugging revealed that during shutdown raid0 (where the root file
> system resides) is not detached, thus causing the dirty parity on
> startup IIUC; raid1 is detached correctly, BTW.
> 
> Not being familiar with the code at all, I added some debugging printfs
> to raidclose() (see attached diff).  With these changes, shutting down
> the system looks like:
> 
>       [...]
>       raidclose(1,6)
>       dk_bopenmask prae: 40
>       dk_bopenmask post: 0
>       dk_openmasks: c=0, b=0, c|b=0
>       doing_shutdown: 1
>       raid1: detached
>       detaching and destroying raid1... done
>       unmounted /dev/raid1g on /a type ffs
>       raidclose(0,0)
>       dk_bopenmask prae: 1
>       dk_bopenmask post: 0
>       dk_openmasks: c=1, b=0, c|b=1
>       forcefully unmounted /dev/raid0a on / type ffs
>       
>       The operating system has halted.
>       Please press any key to reboot.
> 
> So, why is raid0's dk_copenmask non-zero?  (The device which is still
> open according to the mask, raid0a, is where the root file system is
> on.)
> 
> BTW, removing the calls to fd_hold() and fd_free() in lwp_create() and
> lwp_exit() respectively (see other attached diff) causes raid0 to be
> detached on system shutdown just fine:

I have a similar problem.  See the thread 'sc->sc_dk.dk_copenmask == 1
after /etc/rc.d/fsck_root' / 'fd_refcnt leak?' on tech-kern.

It looks to me like the fd_hold() call in lwp_create() will increase
the reference count on the wrong process's filedesc_t when it is called
from fork1().  I believe that fd_exit() is called the correct number of
times, and on the correct LWP, but somebody who understands the code
better than I should double-check.

What if we replace the fd_hold() call in lwp_create() with
atomic_inc_uint(&l2->l_fd->fd_refcnt)?
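
To make the suspected mismatch concrete, here is a toy userland model, not
the kernel code.  It assumes, as the hypothesis above does, that fd_hold()
takes no argument and acts on the calling LWP's descriptor table.  All
names ending in _model, the cur_lwp variable, and the starting reference
counts are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures; NOT the NetBSD definitions. */
struct filedesc_model { unsigned int fd_refcnt; };
struct lwp_model { struct filedesc_model *l_fd; };

/* Models curlwp: the LWP that is executing the call. */
static struct lwp_model *cur_lwp;

/* Assumption: the hold acts on the *calling* LWP's descriptor table. */
static void fd_hold_model(void)
{
    cur_lwp->l_fd->fd_refcnt++;
}

/* lwp_create() as hypothesised: the new LWP l2 is ignored by the hold. */
static void lwp_create_as_is(struct lwp_model *l2)
{
    (void)l2;
    fd_hold_model();
}

/* lwp_create() with the proposed change: count on l2's own table. */
static void lwp_create_proposed(struct lwp_model *l2)
{
    l2->l_fd->fd_refcnt++;
}

struct fork_result {
    unsigned int parent_refcnt;
    unsigned int child_refcnt;
};

/*
 * Models the fork1() path: the parent is the current LWP and the child
 * already has a descriptor table of its own.  The starting counts
 * (parent 1, child 0) are arbitrary illustration values.
 */
static struct fork_result model_fork(void (*create)(struct lwp_model *))
{
    struct filedesc_model parent_fd = { 1 };
    struct filedesc_model child_fd = { 0 };
    struct lwp_model parent = { &parent_fd };
    struct lwp_model child = { &child_fd };
    struct fork_result r;

    assert(create != NULL);
    cur_lwp = &parent;          /* fork1() runs in the parent's context */
    create(&child);

    r.parent_refcnt = parent_fd.fd_refcnt;
    r.child_refcnt = child_fd.fd_refcnt;
    return r;
}

/* Prints both counters so the two variants can be compared side by side. */
static void print_result(const char *label, struct fork_result r)
{
    printf("%s: parent=%u child=%u\n", label, r.parent_refcnt, r.child_refcnt);
}
```

Calling model_fork(lwp_create_as_is) leaves the extra reference on the
parent's table, while model_fork(lwp_create_proposed) puts it on the
child's; print_result() shows the two counters.  Whether the real
fork1() path behaves this way is exactly what needs double-checking.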

Dave

-- 
David Young             OJC Technologies
dyoung%ojctech.com@localhost      Urbana, IL * (217) 278-3933

