At Tue, 30 Jun 2020 12:52:37 -0700, "Greg A. Woods" <woods%planix.com@localhost> wrote:
Subject: So it seems "umount -f /nfs/mount" still doesn't work.....
>

Curiously the kernel now does something I didn't quite expect when one
tries to reboot a system with a stuck mount. I was able to see this as
I was running a kernel that verbosely documents all its shutdown
unmounts and detaches. In prior times I had reached for the power
switch.

At first it just hangs:

lilbit# reboot -q
[ 1131744.8297338] syncing disks... 3 3 done
[ 1131744.9797408] unmounting 0xc1f27000 /more/work (more.local:/work)...
[ 1131744.9907053] ok
[ 1131744.9907053] unmounting 0xc1f24000 /more/archive (more.local:/archive)...
[ 1131745.0004431] ok
[ 1131745.0004431] unmounting 0xc1f21000 /more/home (more.local:/home)...
[ 1131745.0097426] ok
[ 1131745.0097426] unmounting 0xc1f1f000 /once/build (once.local:/build)...
[ 1131745.0097426] ok
[ 1131745.0210854] unmounting 0xc1f1b000 /future/build (future.local:/build)...
[ 1131745.0210854] ok
[ 1131745.0304676] unmounting 0xc1f11000 /building/build (building.local:/build)...

.... this is me hitting ^T to try to see what's going on ....

[ 1131753.2800902] load: 0.52 cmd: reboot 7414 [fstcnt] 0.00u 0.16s 0% 424k
[ 1132107.6651517] load: 0.48 cmd: reboot 7414 [fstcnt] 0.00u 0.16s 0% 424k
[ 1133247.8436109] load: 0.48 cmd: reboot 7414 [fstcnt] 0.00u 0.16s 0% 424k

.... then I hit ^C and immediately it proceeded ....

^C[ 1133249.3636755] unmounting 0xc1f0f000 /proc (procfs)...
[ 1133249.3636755] ok
[ 1133249.3636755] unmounting 0xc1f0d000 /dev/pts (ptyfs)...
[ 1133249.3788641] unmounting 0xc1ecb000 /kern (kernfs)...
[ 1133249.3843127] ok
[ 1133249.3843127] unmounting 0xc1ec9000 /cache (/dev/wd1a)...
[ 1133249.7636916] ok
[ 1133249.7636916] unmounting 0xc1ec6000 /home (/dev/wd0g)...
[ 1133249.7736976] unmounting 0xc1dd7000 /usr/pkg (/dev/wd0f)...
[ 1133250.0737098] unmounting 0xc1ab1000 /var (/dev/wd0e)...
[ 1133250.1537121] unmounting 0xc1804000 / (/dev/wd0a)...
[ 1133251.0337515] unmounting 0xc1f11000 /building/build (building.local:/build)...
[ 1133251.0469644] unmounting 0xc1f0d000 /dev/pts (ptyfs)...
[ 1133251.0469644] unmounting 0xc1ec6000 /home (/dev/wd0g)...
[ 1133251.0579007] unmounting 0xc1dd7000 /usr/pkg (/dev/wd0f)...
[ 1133251.0637673] unmounting 0xc1ab1000 /var (/dev/wd0e)...
[ 1133251.0637673] unmounting 0xc1804000 / (/dev/wd0a)...
[ 1133251.0750403] sd0: detached
[ 1133251.0750403] scsibus0: detached
[ 1133251.0750403] gpio1: detached
[ 1133251.0853614] sysbeep0: detached
[ 1133251.0853614] midi0: detached
[ 1133251.0853614] wd1: detached
[ 1133251.0949369] uhub0: detached
[ 1133251.0949369] com1: detached
[ 1133251.0949369] usb0: detached
[ 1133251.1045456] gpio0: detached
[ 1133251.1045456] ohci0: detached
[ 1133251.1045456] pchb0: detached
[ 1133251.1151702] unmounting 0xc1f11000 /building/build (building.local:/build)...
[ 1133251.1151702] unmounting 0xc1f0d000 /dev/pts (ptyfs)...
[ 1133251.1279509] unmounting 0xc1ec6000 /home (/dev/wd0g)...
[ 1133251.1279509] unmounting 0xc1dd7000 /usr/pkg (/dev/wd0f)...
[ 1133251.1393918] unmounting 0xc1ab1000 /var (/dev/wd0e)...
[ 1133251.1448739] unmounting 0xc1804000 / (/dev/wd0a)...
[ 1133251.1448739] forcefully unmounting /building/build (building.local:/build)...
[ 1133251.1587138] forceful unmount of /building/build failed with error -3
[ 1133251.1653872] rebooting...

So it seems there's some contention between the internal attempt to
unmount the stuck NFS filesystem(s) and the reboot system call itself,
but if the reboot command is interrupted, then the kernel can get on
with its shutdown procedures, and eventually it actually forces the
unmount of the stuck NFS filesystem.

Another interesting thing to note is that /future/build was also stuck,
as future.local is offline at this time. However, that's the filesystem
I tried to clear first by hand with "umount -f /future/build", but that
was stuck, apparently in the same call to nfs_reconnect(). It seems it
had done enough that when reboot() triggered the unmounting, it could
complete the unmount without problems.

(The other mounts on more.local and once.local were responding, so they
unmounted normally.)

--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>