Current-Users archive

Re: root-on-RAID set always dirty on startup



Jukka Salmi --> current-users (2009-06-12 03:10:11 +0200):
[...]
> Rev. 1.377; but wd.c is not the culprit -- the recent file descriptor
> access performance improvements seem to be: when using sources prior
> to that [1]change (`cvs up -D 2009.05.23.18.25.00') parity is clean on
> startup; with sources after it (`cvs up -D 2009.05.23.18.28.10') parity
> is always dirty.

Some debugging revealed that raid0 (where the root file system resides)
is not detached during shutdown, which IIUC is what leaves the parity
dirty on startup; raid1, BTW, is detached correctly.

Not being familiar with the code at all, I added some debugging printfs
to raidclose() (see attached diff).  With these changes, shutting down
the system looks like:

        [...]
        raidclose(1,6)
        dk_bopenmask prae: 40
        dk_bopenmask post: 0
        dk_openmasks: c=0, b=0, c|b=0
        doing_shutdown: 1
        raid1: detached
        detaching and destroying raid1... done
        unmounted /dev/raid1g on /a type ffs
        raidclose(0,0)
        dk_bopenmask prae: 1
        dk_bopenmask post: 0
        dk_openmasks: c=1, b=0, c|b=1
        forcefully unmounted /dev/raid0a on / type ffs
        
        The operating system has halted.
        Please press any key to reboot.

So, why is raid0's dk_copenmask still non-zero?  (According to the mask,
the partition that is still open is raid0a, which holds the root file
system.)
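
To make the question concrete, here is a minimal user-space sketch of the
mask bookkeeping that raidclose() does.  The names toy_disk, toy_fmt and
toy_close() are made up for illustration and are not kernel code; only
the mask arithmetic is taken from the diff.  It assumes, as the output
above suggests, that partition a of raid0 was open through both the
block and the character device before the final block close.

```c
/* Hypothetical stand-in for the three mask fields of the disk structure. */
struct toy_disk {
	unsigned copenmask;	/* partitions open via the character device */
	unsigned bopenmask;	/* partitions open via the block device */
	unsigned openmask;	/* union of the two masks */
};

/* Hypothetical stand-in for the S_IFCHR / S_IFBLK distinction. */
enum toy_fmt { TOY_CHR, TOY_BLK };

/*
 * Clear the bit for `part' in the mask selected by `fmt', recompute the
 * combined mask, and return 1 if nothing is open any more (the "last
 * close" case in which raidclose() goes on to detach), 0 otherwise.
 */
int
toy_close(struct toy_disk *dk, int part, enum toy_fmt fmt)
{
	switch (fmt) {
	case TOY_CHR:
		dk->copenmask &= ~(1u << part);
		break;
	case TOY_BLK:
		dk->bopenmask &= ~(1u << part);
		break;
	}
	dk->openmask = dk->copenmask | dk->bopenmask;

	return dk->openmask == 0;
}
```

Starting from c=1, b=1, a call to toy_close(&dk, 0, TOY_BLK) leaves c=1,
b=0, c|b=1 and returns 0, which matches the raid0 output above: the
block close alone never reaches the detach path as long as the
character-device bit stays set.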

BTW, removing the calls to fd_hold() and fd_free() in lwp_create() and
lwp_exit(), respectively (see the other attached diff), lets raid0 be
detached on system shutdown just fine:

        [...]
        raidclose(0,0)
        dk_bopenmask prae: 1
        dk_bopenmask post: 0
        dk_openmasks: c=0, b=0, c|b=0
        doing_shutdown: 1
        raid0: detached
        detaching and destroying raid0... done
        forcefully unmounted /dev/raid0a on / type ffs
        
        The operating system has halted.
        Please press any key to reboot.

Any hints?


Regards, Jukka

> [1] http://mail-index.netbsd.org/source-changes/2009/05/23/msg221612.html

-- 
This email fills a much-needed gap in the archives.

Index: sys/dev/raidframe/rf_netbsdkintf.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_netbsdkintf.c,v
retrieving revision 1.265
diff -u -p -r1.265 rf_netbsdkintf.c
--- sys/dev/raidframe/rf_netbsdkintf.c  10 Jun 2009 14:17:13 -0000      1.265
+++ sys/dev/raidframe/rf_netbsdkintf.c  10 Jul 2009 15:50:17 -0000
@@ -810,20 +810,30 @@ raidclose(dev_t dev, int flags, int fmt,
                return (error);
 
        part = DISKPART(dev);
+       printf("raidclose(%d,%d)\n", unit, part);
 
        /* ...that much closer to allowing unconfiguration... */
        switch (fmt) {
        case S_IFCHR:
+               printf("dk_copenmask prae: %x\n", rs->sc_dkdev.dk_copenmask);
                rs->sc_dkdev.dk_copenmask &= ~(1 << part);
+               printf("dk_copenmask post: %x\n", rs->sc_dkdev.dk_copenmask);
                break;
 
        case S_IFBLK:
+               printf("dk_bopenmask prae: %x\n", rs->sc_dkdev.dk_bopenmask);
                rs->sc_dkdev.dk_bopenmask &= ~(1 << part);
+               printf("dk_bopenmask post: %x\n", rs->sc_dkdev.dk_bopenmask);
                break;
        }
        rs->sc_dkdev.dk_openmask =
            rs->sc_dkdev.dk_copenmask | rs->sc_dkdev.dk_bopenmask;
 
+       printf("dk_openmasks: c=%x, b=%x, c|b=%x\n",
+           rs->sc_dkdev.dk_copenmask,
+           rs->sc_dkdev.dk_bopenmask,
+           rs->sc_dkdev.dk_openmask);
+
        if ((rs->sc_dkdev.dk_openmask == 0) &&
            ((rs->sc_flags & RAIDF_INITED) != 0)) {
                /* Last one... device is not unconfigured yet.
@@ -833,6 +843,7 @@ raidclose(dev_t dev, int flags, int fmt,
 
                rf_update_component_labels(raidPtrs[unit],
                                                 RF_FINAL_COMPONENT_UPDATE);
+               printf("doing_shutdown: %d\n", doing_shutdown);
                if (doing_shutdown) {
                        /* last one, and we're going down, so
                           lights out for this RAID set too. */
@@ -844,12 +855,15 @@ raidclose(dev_t dev, int flags, int fmt,
                        /* detach the device */
                        
                        cf = device_cfdata(rs->sc_dev);
-                       error = config_detach(rs->sc_dev, DETACH_QUIET);
+                       error = config_detach(rs->sc_dev, 0 /*DETACH_QUIET*/);
                        free(cf, M_RAIDFRAME);
                        
                        /* Detach the disk. */
+                       printf("detaching and destroying %s... ",
+                           rs->sc_dkdev.dk_name);
                        disk_detach(&rs->sc_dkdev);
                        disk_destroy(&rs->sc_dkdev);
+                       printf("done\n");
                }
        }
 
Index: sys/kern/kern_lwp.c
===================================================================
RCS file: /cvsroot/src/sys/kern/kern_lwp.c,v
retrieving revision 1.131
diff -u -p -r1.131 kern_lwp.c
--- sys/kern/kern_lwp.c 23 May 2009 18:28:06 -0000      1.131
+++ sys/kern/kern_lwp.c 10 Jul 2009 15:50:25 -0000
@@ -606,7 +606,9 @@ lwp_create(lwp_t *l1, proc_t *p2, vaddr_
        l2->l_fd = p2->p_fd;
        if (p2->p_nlwps != 0) {
                KASSERT(l1->l_proc == p2);
+#if 0
                fd_hold();
+#endif
        } else {
                KASSERT(l1->l_proc != p2);
        }
@@ -766,7 +768,9 @@ lwp_exit(struct lwp *l)
                (*p->p_emul->e_lwp_exit)(l);
 
        /* Drop filedesc reference. */
+#if 0
        fd_free();
+#endif
 
        /* Delete the specificdata while it's still safe to sleep. */
        specificdata_fini(lwp_specificdata_domain, &l->l_specdataref);

