Current-Users archive
Re: root-on-RAID set always dirty on startup
On Fri, Jul 10, 2009 at 06:16:39PM +0200, Jukka Salmi wrote:
> Jukka Salmi --> current-users (2009-06-12 03:10:11 +0200):
> [...]
> > Rev. 1.377; but wd.c is not the culprit -- the recent file descriptor
> > access performance improvements seem to be: when using sources prior
> > to that change (`cvs up -D 2009.05.23.18.25.00') parity is clean on
> > startup; with sources after it (`cvs up -D 2009.05.23.18.28.10') parity
> > is always dirty.
>
> Some debugging revealed that during shutdown raid0 (where the root file
> system resides) is not detached, thus causing the dirty parity on
> startup IIUC; raid1 is detached correctly, BTW.
>
> Not being familiar with the code at all, I added some debugging printfs
> to raidclose() (see attached diff). With these changes, shutting down
> the system looks like:
>
> [...]
> raidclose(1,6)
> dk_bopenmask prae: 40
> dk_bopenmask post: 0
> dk_openmasks: c=0, b=0, c|b=0
> doing_shutdown: 1
> raid1: detached
> detaching and destroying raid1... done
> unmounted /dev/raid1g on /a type ffs
> raidclose(0,0)
> dk_bopenmask prae: 1
> dk_bopenmask post: 0
> dk_openmasks: c=1, b=0, c|b=1
> forcefully unmounted /dev/raid0a on / type ffs
>
> The operating system has halted.
> Please press any key to reboot.
>
> So, why is raid0's dk_copenmask non-zero? (The device which is still
> open according to the mask, raid0a, is where the root file system is
> on.)
>
> BTW, removing the calls to fd_hold() and fd_free() in lwp_create() and
> lwp_exit() respectively (see other attached diff) causes raid0 to be
> detached on system shutdown just fine:

I have a similar problem. See the thread 'sc->sc_dk.dk_copenmask == 1
after /etc/rc.d/fsck_root' / 'fd_refcnt leak?' on tech-kern.
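
To make the quoted debug output easier to follow, here is a rough user-space sketch of the open-mask bookkeeping. It is not the kernel code: the toy_* names are invented for illustration, and the rule that a set is detached at shutdown only when both masks are empty is inferred from the output above. Reading the printed masks as hexadecimal, 40 is bit 6 (partition g, raidclose(1,6)) and 1 is bit 0 (partition a, raidclose(0,0)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the per-disk open masks; not the kernel's struct disk. */
struct toy_disk {
    uint32_t copenmask;  /* partitions open through the character device */
    uint32_t bopenmask;  /* partitions open through the block device */
};

enum toy_fmt { TOY_FMT_CHR, TOY_FMT_BLK };

/* Record that partition `part` was opened through the given device type. */
static void toy_open(struct toy_disk *dk, unsigned part, enum toy_fmt fmt)
{
    assert(part < 32);
    if (fmt == TOY_FMT_CHR)
        dk->copenmask |= UINT32_C(1) << part;
    else
        dk->bopenmask |= UINT32_C(1) << part;
}

/* Clear the partition's bit in the mask that matches the device type. */
static void toy_close(struct toy_disk *dk, unsigned part, enum toy_fmt fmt)
{
    assert(part < 32);
    if (fmt == TOY_FMT_CHR)
        dk->copenmask &= ~(UINT32_C(1) << part);
    else
        dk->bopenmask &= ~(UINT32_C(1) << part);
}

/* Combined mask: the "c|b" value in the debug output. */
static uint32_t toy_openmask(const struct toy_disk *dk)
{
    return dk->copenmask | dk->bopenmask;
}

/* Assumption: the set can be detached only once nothing is open. */
static bool toy_can_detach(const struct toy_disk *dk)
{
    return toy_openmask(dk) == 0;
}
```

In this model raid1 ends with both masks clear and can be detached, while raid0 still has bit 0 set in the character mask after the block close, so it cannot.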

It looks to me like the fd_hold() call in lwp_create() increments the
reference count on the wrong process's filedesc_t when lwp_create() is
called from fork1(). I believe that fd_exit() is called the correct
number of times, and on the correct LWP, but somebody who understands
the code better than I do should double-check.

What if we change the fd_hold() call in lwp_create() to
atomic_inc_uint(&l2->l_fd->fd_refcnt)?
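
A small user-space model of what I suspect is happening, in case it helps. Everything named toy_* is made up for illustration. It assumes that fd_hold() takes its reference on the calling LWP's descriptor table, and that the child's table starts with no reference in this model; neither assumption has been checked against the kernel sources. Plain increments stand in for atomic_inc_uint().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for filedesc_t and struct lwp. */
struct toy_filedesc { unsigned refcnt; };
struct toy_lwp { struct toy_filedesc *l_fd; };

/* The LWP that is running, like curlwp in the kernel. */
static struct toy_lwp *toy_curlwp;

/* Assumed current behaviour: the reference lands on the caller's table. */
static void toy_fd_hold(void)
{
    assert(toy_curlwp != NULL);
    toy_curlwp->l_fd->refcnt++;
}

/* Proposed behaviour: the reference lands on the new LWP's table. */
static void toy_fd_hold_lwp(struct toy_lwp *l2)
{
    l2->l_fd->refcnt++;
}

/*
 * Model of the fork path: the parent is running, the child gets its own
 * table, and then one reference is taken on behalf of the child.
 */
static void toy_fork(struct toy_lwp *parent, struct toy_lwp *child,
    struct toy_filedesc *child_fd, bool use_proposed)
{
    toy_curlwp = parent;       /* the fork runs in the parent's context */
    child->l_fd = child_fd;    /* the child has a separate table */
    if (use_proposed)
        toy_fd_hold_lwp(child);
    else
        toy_fd_hold();         /* hits the parent's table instead */
}
```

With the current call the parent's count goes from 1 to 2 and the child's stays at 0; with the proposed call the parent's stays at 1 and the child's becomes 1. In this model the extra reference on the parent's table is never dropped, which would be consistent with a descriptor for raid0a staying open through shutdown.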
Dave
--
David Young OJC Technologies
dyoung%ojctech.com@localhost Urbana, IL * (217) 278-3933