kern/53624 (dom0 freeze on domU exit) is still there
The following reply was made to PR kern/53624; it has been noted by GNATS.
From: Manuel.Bouyer@lip6.fr
To: gnats-bugs@NetBSD.org
Cc:
Subject: kern/53624 (dom0 freeze on domU exit) is still there
Date: Wed, 18 Sep 2019 16:54:56 +0200 (MEST)
>Submitter-Id: net
>Originator: Manuel Bouyer
>Organization:
>Confidential: no
>Synopsis: kern/53624 (dom0 freeze on domU exit) is still there
>Severity: serious
>Priority: high
>Category: kern
>Class: sw-bug
>Release: NetBSD 8.1_STABLE
>Environment:
System: NetBSD xen1.soc.lip6.fr 8.1_STABLE NetBSD 8.1_STABLE (ADMIN_DOM0) #0: Tue Sep 17 15:47:43 MEST 2019 bouyer@armandeche.soc.lip6.fr:/local/armandeche1/tmp/build/amd64/obj/local/armandeche1/netbsd-8/src/sys/arch/amd64/compile/ADMIN_DOM0 x86_64
Architecture: x86_64
Machine: amd64
>Description:
On my testbed, which starts and destroys several domUs per day (sometimes
in parallel), I see occasional filesystem hangs with processes
waiting on fstchg.
The interesting processes are (columns: PID, LID, state, CPU, flags,
struct lwp address, name, wait channel):
0 105 3 0 200 ffffa0000213e5a0 vnd1 fstchg
0 104 3 0 200 ffffa00002088160 vnd0 vndbp
0 97 3 0 200 ffffa0000206a980 vnd3 vndbp
0 96 3 0 200 ffffa0000105a280 vnd2 fstchg
0 67 3 0 200 ffffa00000d73640 ioflush fstchg
6533 1 3 0 0 ffffa00001f77080 vnconfig biowait
25777 1 3 0 80 ffffa00001e5f480 vnconfig fstcnt
db> tr/a ffffa0000213e5a0
trace: pid 0 lid 105 at 0xffffa0002cffd4f0
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0xf0
fstrans_start() at netbsd:fstrans_start+0x78e
VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x42
genfs_getpages() at netbsd:genfs_getpages+0x1344
VOP_GETPAGES() at netbsd:VOP_GETPAGES+0x4b
ubc_fault() at netbsd:ubc_fault+0x188
uvm_fault_internal() at netbsd:uvm_fault_internal+0x6d4
trap() at netbsd:trap+0x3c1
--- trap (number 6) ---
kcopy() at netbsd:kcopy+0x15
uiomove() at netbsd:uiomove+0xb9
ubc_uiomove() at netbsd:ubc_uiomove+0xf7
ffs_read() at netbsd:ffs_read+0xf7
VOP_READ() at netbsd:VOP_READ+0x33
vn_rdwr() at netbsd:vn_rdwr+0x10c
vndthread() at netbsd:vndthread+0x5b1
db> tr/a ffffa0000105a280
trace: pid 0 lid 96 at 0xffffa0002cf4d9c0
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0xf0
fstrans_start() at netbsd:fstrans_start+0x78e
VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x42
genfs_do_io() at netbsd:genfs_do_io+0x1b4
genfs_gop_write() at netbsd:genfs_gop_write+0x52
genfs_do_putpages() at netbsd:genfs_do_putpages+0xb9c
VOP_PUTPAGES() at netbsd:VOP_PUTPAGES+0x36
vndthread() at netbsd:vndthread+0x683
db> tr/a ffffa00000d73640
trace: pid 0 lid 67 at 0xffffa0002cd48ca0
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0xf0
fstrans_start() at netbsd:fstrans_start+0x78e
VOP_BWRITE() at netbsd:VOP_BWRITE+0x42
ffs_sbupdate() at netbsd:ffs_sbupdate+0xc3
ffs_cgupdate() at netbsd:ffs_cgupdate+0x20
ffs_sync() at netbsd:ffs_sync+0x1e9
sched_sync() at netbsd:sched_sync+0x93
db> tr/a ffffa00001f77080
trace: pid 6533 lid 1 at 0xffffa0002cff8910
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0xf0
biowait() at netbsd:biowait+0x4f
scan_iso_vrs_session() at netbsd:scan_iso_vrs_session+0x60
readdisklabel() at netbsd:readdisklabel+0x304
vndopen() at netbsd:vndopen+0x305
spec_open() at netbsd:spec_open+0x385
VOP_OPEN() at netbsd:VOP_OPEN+0x2f
vn_open() at netbsd:vn_open+0x1e9
do_open() at netbsd:do_open+0x112
do_sys_openat() at netbsd:do_sys_openat+0x68
sys_open() at netbsd:sys_open+0x24
syscall() at netbsd:syscall+0x9c
db> tr/a ffffa00001e5f480
trace: pid 25777 lid 1 at 0xffffa0002b358860
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait_sig() at netbsd:cv_wait_sig+0xf4
fstrans_setstate() at netbsd:fstrans_setstate+0xaa
genfs_suspendctl() at netbsd:genfs_suspendctl+0x57
vfs_suspend() at netbsd:vfs_suspend+0x5b
vrevoke_suspend_next() at netbsd:vrevoke_suspend_next+0x2a
vrevoke() at netbsd:vrevoke+0x2b
genfs_revoke() at netbsd:genfs_revoke+0x13
VOP_REVOKE() at netbsd:VOP_REVOKE+0x2e
vdevgone() at netbsd:vdevgone+0x5a
vnddoclear() at netbsd:vnddoclear+0xc6
vndioctl() at netbsd:vndioctl+0x3bb
VOP_IOCTL() at netbsd:VOP_IOCTL+0x37
vn_ioctl() at netbsd:vn_ioctl+0xa6
sys_ioctl() at netbsd:sys_ioctl+0x101
syscall() at netbsd:syscall+0x9c
db> call fstrans_dump
Fstrans locks by lwp:
6533.1 (/) shared 1 cow 0
0.105 (/domains) lazy 3 cow 0
0.96 (/domains) lazy 2 cow 0
0.67 (/domains) shared 1 cow 0
Fstrans state by mount:
/ state suspending
So it looks like we have a 3-way deadlock between ioflush and the two
vnconfig threads (whereas kern/53624 was only between the two vnconfig
threads), but I can't see the exact scenario yet. Also note that the
files backing the vnds are in /domains, not in /.
WAPBL is configured in the kernel but not in use.
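For readers not familiar with fstrans(9): all the blocked threads above
are stuck either entering a suspending filesystem (wchan fstchg) or
waiting, as the suspender, for the active transaction count to drain
(wchan fstcnt). Below is a rough userland sketch of that gate; it is NOT
the NetBSD implementation, just a model whose names loosely mirror the
kernel functions seen in the traces:

/*
 * Userland model of an fstrans(9)-style suspend gate.  Illustration
 * only; the real kernel code lives in sys/kern/vfs_trans.c.
 */
#include <pthread.h>
#include <stdio.h>

static struct fsmodel {
        pthread_mutex_t lock;
        pthread_cond_t cv;      /* models both the fstchg and fstcnt waits */
        int suspending;         /* a suspend is in progress */
        int active;             /* transactions currently inside the fs */
} fs = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 };

/* Like fstrans_start(): wait out any suspend, then enter ("fstchg"). */
static void
model_trans_start(void)
{
        pthread_mutex_lock(&fs.lock);
        while (fs.suspending)
                pthread_cond_wait(&fs.cv, &fs.lock);   /* wchan fstchg */
        fs.active++;
        pthread_mutex_unlock(&fs.lock);
}

/* Like fstrans_done(): leave, and wake a suspender draining "fstcnt". */
static void
model_trans_done(void)
{
        pthread_mutex_lock(&fs.lock);
        fs.active--;
        pthread_cond_broadcast(&fs.cv);
        pthread_mutex_unlock(&fs.lock);
}

/* Like vfs_suspend(): stop new entries, drain active ones ("fstcnt"). */
static void
model_suspend(void)
{
        pthread_mutex_lock(&fs.lock);
        fs.suspending = 1;
        while (fs.active > 0)
                pthread_cond_wait(&fs.cv, &fs.lock);   /* wchan fstcnt */
        pthread_mutex_unlock(&fs.lock);
}

/* Like vfs_resume(): reopen the gate for threads blocked in "fstchg". */
static void
model_resume(void)
{
        pthread_mutex_lock(&fs.lock);
        fs.suspending = 0;
        pthread_cond_broadcast(&fs.cv);
        pthread_mutex_unlock(&fs.lock);
}

int
main(void)
{
        /*
         * Single-threaded this is fine.  The hang above corresponds to
         * threads blocked in model_trans_start() while whatever would
         * let the suspender drain (and eventually resume) is itself
         * stuck behind them.
         */
        model_trans_start();
        model_trans_done();
        model_suspend();
        model_resume();
        printf("no deadlock in the single-threaded case\n");
        return 0;
}

In the kernel the gate is per-mount, which is why the fstrans_dump
output above distinguishes / from /domains; a thread blocked at one
mount's gate while still holding transactions another suspender needs
drained is the general shape such deadlocks take.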
>How-To-Repeat:
Create and shut down several domUs in parallel with xl create / xl shutdown.
>Fix:
please ...