NetBSD-Bugs archive


Re: kern/54969 (Disk cache is no longer flushed on shutdown)



The following reply was made to PR kern/54969; it has been noted by GNATS.

From: "Greg A. Woods" <woods%planix.ca@localhost>
To: NetBSD-current Users's Discussion List <current-users%netbsd.org@localhost>,
    NetBSD GNATS <gnats-bugs%NetBSD.org@localhost>
Cc: NetBSD Users's Discussion List <netbsd-users%netbsd.org@localhost>
Subject: Re: kern/54969 (Disk cache is no longer flushed on shutdown)
Date: Thu, 25 Mar 2021 11:14:48 -0700

 
 So, the reason I jumped from what I thought was a relatively stable
 point in the main -current branch to a more recent version was primarily
 because of what I believe is a problem related to PR# 54969.
 
 I had noticed long fscks on large filesystems following normal clean
 reboots and started investigating.
 
 Maybe what remains an issue here is just related to dm(4) partitions, as
 only my /dev/mapper partition(s) have had problems recently.
 
 Unfortunately, this is still happening with 9.99.81 (2021-03-10), with
 both GENERIC and XEN3_DOM0 kernels.
 
 In any case I would say this is the single most critical, serious, and
 important issue in -current (and netbsd-9)!  It totally kills system
 reliability (though maybe only if one is using LVM).
 
 As evidence, I added a bunch more printfs to the kernel and rc.d
 scripts (and '-v' flags to fsck, mount, etc.) to help me see for myself
 exactly what is going on.  This is the console output after a completely
 normal, safe reboot using shutdown(8).
 
 In this example all processes but the shutdown scripts should be dead.
 The NFS mount on /more/work probably won't complete because I probably
 started shutdown(8) without first doing "cd /" (and without doing "exec
 shutdown"), and my CWD was probably on that NFS mount.  This could maybe
 be fixed by having reboot/halt/powerdown kill its parent process first,
 and perhaps also doing chdir("/").
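
 The chdir("/") idea can be tried from userland without touching
 reboot(8) itself.  What follows is only a minimal sketch of a wrapper,
 not a proposed patch: the function name is made up for illustration,
 and the /sbin/shutdown path and "-r now" arguments are assumptions
 about how shutdown(8) would be invoked.  It moves the working
 directory off any mounted file system before handing over to
 shutdown(8).

```python
import os


def prepare_for_shutdown(argv=("/sbin/shutdown", "-r", "now")):
    """Leave any mounted file system, then return the argv to exec.

    Hypothetical helper: it only changes the current working directory
    to "/" so that this process no longer pins an NFS (or other) mount.
    """
    os.chdir("/")
    return list(argv)


# Usage (left commented out so the sketch is safe to run):
# argv = prepare_for_shutdown()
# os.execv(argv[0], argv)  # replaces this process, like "exec shutdown"
```

 Using os.execv() rather than spawning a child mirrors "exec shutdown":
 no parent process is left behind holding a working directory on the
 NFS mount.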
 
 There's no excuse I can find for /build not unmounting though, and
 definitely no excuse for '/' not unmounting either, though '/' is later
 forcefully unmounted, and on reboot '/' appears to be clean.  However,
 although the log reports /build as forcefully unmounted too, it is NOT
 clean on reboot.
 
 Note also that /build will sometimes unmount quickly and cleanly if it
 hasn't been dirtied since the last boot, but it seems even creating one
 file can leave it dirty on reboot.
 
 Maybe what remains an issue here is just related to dm(4) partitions?
 
 
 [Wed Mar 24 20:42:56 2021][ 715713.0781096] syncing disks... done
 [Wed Mar 24 20:42:56 2021][ 715713.2081201] unmounted more.local:/vcs from /more/vcs, type nfs
 [Wed Mar 24 20:42:56 2021][ 715713.2481211] unmount of /more/work (more.local:/work) failed with error 16
 [Wed Mar 24 20:42:56 2021][ 715713.2581208] unmounted more.local:/home from /more/home, type nfs
 [Wed Mar 24 20:42:56 2021][ 715713.2581208] unmounted more.local:/archive from /more/archive, type nfs
 [Wed Mar 24 20:42:57 2021][ 715714.0781691] unmount of /build (/dev/mapper/scratch-build) failed with error 16
 [Wed Mar 24 20:42:57 2021][ 715714.0781691] unmounted procfs from /proc, type procfs
 [Wed Mar 24 20:42:57 2021][ 715714.0781691] unmounted ptyfs from /dev/pts, type ptyfs
 [Wed Mar 24 20:42:57 2021][ 715714.0781691] unmounted kernfs from /kern, type kernfs
 [Wed Mar 24 20:42:58 2021][ 715714.6782049] unmounted /dev/dk3 from /usr/pkg, type ffs
 [Wed Mar 24 20:42:58 2021][ 715714.7282939] unmounted /dev/dk2 from /var, type ffs
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] WARNING: some file systems would not unmount
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] unmount of /more/work (more.local:/work) failed with error 16
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] unmount of /build (/dev/mapper/scratch-build) failed with error 16
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:42:58 2021][ 715714.8282434] WARNING: some file systems would not unmount
 [Wed Mar 24 20:42:59 2021][ 715716.5383256] brgphy1: detached
 
 	[[ ... almost all the rest of devices detach ... ]]
 
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of /more/work (more.local:/work) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of /build (/dev/mapper/scratch-build) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] WARNING: some file systems would not unmount
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] sd1: detached
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of /more/work (more.local:/work) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of /build (/dev/mapper/scratch-build) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] WARNING: some file systems would not unmount
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] forcefully unmounting more.local:/work from /more/work...
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] forcefully unmounted more.local:/work from /more/work, type nfs
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of /build (/dev/mapper/scratch-build) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] WARNING: some file systems would not unmount
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] forcefully unmounting /dev/mapper/scratch-build from /build...
 [Wed Mar 24 20:43:02 2021][ 715718.5284461] forcefully unmounted /dev/mapper/scratch-build from /build, type ffs
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] unmount of / (/dev/dk0) failed with error 16
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] WARNING: some file systems would not unmount
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] forcefully unmounting /dev/dk0 from /...
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] forcefully unmounted /dev/dk0 from /, type ffs
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] unmounting done
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] turning off swap... done
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] dk0 at sd0 (/) deleted
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] sd0: detached
 [Wed Mar 24 20:43:02 2021][ 715718.5384534] scsibus0: detached
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] mfi0: detached
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] pci8: detached
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] ppb7: detached
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] unmounting done
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] turning off swap... done
 [Wed Mar 24 20:43:02 2021][ 715718.7184994] rebooting...
 
 	[[ ... why is "turning off swap" seen twice? .. ]]
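
 The repeated failures in the shutdown log above are easier to read
 when tallied per mount point.  Here is a rough sketch that does so,
 assuming the message format shown above; the function name is made up
 for illustration.  The sample lines are copied from the log with the
 wall-clock prefix trimmed.  The errno name is looked up on whatever
 host runs the script, which is an assumption worth checking, though
 16 is EBUSY on NetBSD.

```python
import errno
import re
from collections import Counter

# Matches "unmount of <mountpoint> (<device>) failed with error <n>".
FAILED = re.compile(r"unmount of (\S+) \((\S+)\) failed with error (\d+)")


def summarize_unmount_failures(log_text):
    """Count failed unmount attempts per (mount point, device, errno name)."""
    counts = Counter()
    for match in FAILED.finditer(log_text):
        mount_point, device = match.group(1), match.group(2)
        code = int(match.group(3))
        # Fall back to the bare number if the host has no name for it.
        name = errno.errorcode.get(code, str(code))
        counts[(mount_point, device, name)] += 1
    return counts


sample = """\
[ 715713.2481211] unmount of /more/work (more.local:/work) failed with error 16
[ 715714.0781691] unmount of /build (/dev/mapper/scratch-build) failed with error 16
[ 715714.8282434] unmount of / (/dev/dk0) failed with error 16
[ 715714.8282434] unmount of /build (/dev/mapper/scratch-build) failed with error 16
"""

for key, n in sorted(summarize_unmount_failures(sample).items()):
    mount_point, device, name = key
    print(f"{mount_point} ({device}): {n} x {name}")
```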
 
 	[[ ... and then the reboot, until rc scripts say ... ]]
 
 [Wed Mar 24 20:44:51 2021]Starting root file system check:
 [Wed Mar 24 20:44:51 2021]/dev/rdk0: file system is clean; not checking
 [Wed Mar 24 20:44:51 2021]start / wait fsck_ffs -p /dev/rdk0
 
 
 [Wed Mar 24 20:44:52 2021]Starting file system checks:
 [Wed Mar 24 20:44:52 2021]/dev/rdk2: file system is clean; not checking
 [Wed Mar 24 20:44:52 2021]/dev/rdk3: file system is clean; not checking
 
 	[[ ... here I hit ^T on the console as it was taking too long ... ]]
 
 [Wed Mar 24 20:44:58 2021][  15.0201108] load: 0.08  cmd: sleep 345 [nanoslp] 0.00u 0.00s 0% 512k
 [Wed Mar 24 20:44:58 2021]/dev/mapper/rscratch-build: phase 1: cyl group 24 of 345 (6%)
 [Wed Mar 24 20:46:09 2021]/dev/mapper/rscratch-build: phase 1: cyl group 284 of 345 (82%)
 [Wed Mar 24 20:49:30 2021]/dev/mapper/rscratch-build: 1400986 files, 36172587 used, 28347707 free (17403 frags, 3541288 blocks, 0.0% fragmentation)
 [Wed Mar 24 20:49:30 2021]/dev/mapper/rscratch-build: MARKING FILE SYSTEM CLEAN
 [Wed Mar 24 20:49:30 2021]start /var nowait fsck_ffs -p /dev/rdk2
 [Wed Mar 24 20:49:30 2021]start /build nowait fsck_ffs -p /dev/mapper/rscratch-build
 [Wed Mar 24 20:49:30 2021]done ffs: /dev/rdk2 (/var) = 0x0
 [Wed Mar 24 20:49:30 2021]start /usr/pkg nowait fsck_ffs -p /dev/rdk3
 [Wed Mar 24 20:49:30 2021]done ffs: /dev/rdk3 (/usr/pkg) = 0x0
 [Wed Mar 24 20:49:30 2021]done ffs: /dev/mapper/rscratch-build (/build) = 0x0
 [Wed Mar 24 20:49:30 2021]Script /etc/rc.d/fsck running
 [Wed Mar 24 20:49:30 2021]Currently sourcing /etc/rc.d/fsck
 [Wed Mar 24 20:49:30 2021]exec: mount_ffs -o rw /dev/dk2 /var
 [Wed Mar 24 20:49:30 2021]exec: mount_ffs -o rw /dev/dk2 /var
 [Wed Mar 24 20:49:30 2021]/dev/dk2 on /var type ffs (local, fsid: 0xa802/0x78b, reads: sync 1 async 0, writes: sync 2 async 0)
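
 To put a number on "long fsck", the wall-clock stamps in the console
 log above can be subtracted.  This is a small sketch, assuming the
 stamps use the format shown and that the check of /build started right
 after the 20:44:52 "file system is clean" lines; the function name is
 made up for illustration.

```python
from datetime import datetime

# Format of the console-logger prefix, e.g. "Wed Mar 24 20:44:52 2021".
STAMP = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start, end):
    """Return the number of seconds between two console time stamps."""
    delta = datetime.strptime(end, STAMP) - datetime.strptime(start, STAMP)
    return delta.total_seconds()


# From the start of the file system checks to "MARKING FILE SYSTEM CLEAN".
print(elapsed_seconds("Wed Mar 24 20:44:52 2021", "Wed Mar 24 20:49:30 2021"))
# prints 278.0
```

 That is a bit over four and a half minutes of fsck after a reboot that
 should have left /build clean.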
 
 
 --
 					Greg A. Woods <gwoods%acm.org@localhost>
 
 Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
 Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>
 
 

