NetBSD-Bugs archive


kern/52783: parallel fsck hangs during boot of 8.99.[5678]



>Number:         52783
>Category:       kern
>Synopsis:       parallel fsck hangs during boot of 8.99.[5678]
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 03 20:30:00 +0000 2017
>Originator:     Onno van der Linden
>Release:        NetBSD 8.99.8
>Organization:

>Environment:
System: NetBSD sheep 8.99.8 NetBSD 8.99.8 (SHEEPKMS) #0: Sun Dec 3 15:49:57 CET 2017 onno@sheep:/usr/src/sys/arch/i386/compile/SHEEPKMS i386
Architecture: i386
Machine: i386
>Description:
fsck, when called from /etc/rc.d/fsck with the default (parallel) options, frequently hangs during boot.
My /etc/fstab:
/dev/wd0a	/	ffs	rw	1 1
/dev/wd0b	none	swap	sw	0 0
/dev/wd0e	/home	ffs	rw,log	1 2
/dev/wd0f	/usr	ffs	rw,log	1 2
/dev/wd0g	/usr/pkg	ffs	rw,log	1 2
/dev/wd0h	/usr/src	ffs	rw,log	1 2
/dev/wd0i	/usr/xsrc	ffs	rw,log	1 2
/dev/wd0j	/cd	ffs	rw,log	1 2
/dev/wd0k	/cd3	ffs	rw,log	1 2
/dev/wd1a	/var	ffs	rw,log	1 2
/dev/wd1b	/tmp	ffs	rw,async	1 2
/dev/wd1e	/usr/obj	ffs	rw,async	1 2
/dev/wd1f	/usr/pkgsrc	ffs	rw,log	1 2
/dev/wd1g	/cd2	ffs	rw,log	1 2
/dev/wd1h	/cd4	ffs	rw,log	1 2
NAME=dk0	/cd6	ffs	rw,log	1 2
NAME=dk1	/cd7	ffs	rw,log	1 2
NAME=dk2	/cd8	ffs	rw,log	1 2
NAME=dk3	/cd9	ffs	rw,log	1 2
/dev/cd0a       /cdrom  cd9660  ro,noauto,nodev,nosuid  0 0
kernfs		/kern	kernfs	rw	0 0
procfs		/proc	procfs	rw,linux
tmpfs		/usr/pkg/emul/linux/dev/shm tmpfs rw,-m1777
tmpfs	/var/shm	tmpfs	rw,-m1777,-sram%25
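
For reference, here is a minimal sketch (not part of fsck itself) of how the entries above fall into groups for the parallel pass. It assumes the fstab(5) convention that file systems with a pass number greater than 1 are checked in parallel across drives and sequentially within one drive. The helper names disk_of and parallel_groups are hypothetical, and each NAME=dkN wedge is treated as its own group because the fstab does not say which physical disk it lives on.

```python
import re
from collections import defaultdict


def disk_of(spec):
    """Guess the drive for an fstab device spec.

    /dev/wd0e -> wd0. Anything else (e.g. NAME=dk0) is returned
    unchanged and so forms its own group; that is an assumption,
    since the parent disk of a wedge is not visible in fstab.
    """
    m = re.match(r"/dev/([a-z]+\d+)[a-p]$", spec)
    return m.group(1) if m else spec


def parallel_groups(fstab_text):
    """Map each drive to the mount points checked after pass 1."""
    groups = defaultdict(list)
    for line in fstab_text.splitlines():
        fields = line.split()
        # Entries without a pass-number field (procfs, tmpfs here)
        # are skipped, as are comment lines.
        if len(fields) < 6 or fields[0].startswith("#"):
            continue
        spec, mountpoint, passno = fields[0], fields[1], int(fields[5])
        if passno > 1:
            groups[disk_of(spec)].append(mountpoint)
    return dict(groups)


if __name__ == "__main__":
    with open("/etc/fstab") as f:
        for disk, mounts in parallel_groups(f.read()).items():
            print(disk, len(mounts), " ".join(mounts))
```

Run against the fstab above, this shows how many file systems queue up behind each drive, which may help when correlating the hung fsck_ffs processes with particular disks.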

On an 8.99.8 kernel, DDB shows (copied from my notes):
58 fsck biolock getblk->bbusy->cv_timed_wait->sleepq_block
57 fsck_ffs tstile do_sys_sync->mutex_vector_enter->turnstile_block->sleepq_block
54 fsck_ffs biolock same stack as 58
50 fsck wait
46 sh wait
17 sh tstile genfs_lock->rw_vector_enter->turnstile_block->sleepq_block
0.60 ioflush biolock 

On an 8.99.5 kernel (sources up to Nov 2 2017 0:00) compiled with LOCKDEBUG,
one fsck_ffs process hangs in biowait with this stack (again copied from my notes):
wdopen->dk_open->dk_getdisklabel->readdisklabel->scan_mbr->biowait->cv_wait->sleepq_block

Note that this stack trace is very similar to that of thread 59 in PR 46462.
When I run "call cpu_reboot" in DDB with this kernel, I get:
panic LOCKDEBUG: mutex_vector_enter,510 acquiring sleep lock from interrupt context

I can easily reproduce the hang with slightly older kernels from November. With
8.99.8 it took a couple of attempts, and the stack traces differ from those of
the earlier kernels.

I'm also wondering whether any of this is related to PRs 46462 and 52769.


>How-To-Repeat:
(Re)boot my system.
>Fix:
A workaround is to set fsck_flags="-l 0" in /etc/rc.conf.

>Unformatted:
 2017.12.03.00.00

