tech-kern archive

Xen dom0 freeze after domU exits (was Re: Zombie kernel thread)



Hello

Following up on my previous report: on a XEN3_DOM0 kernel, the system very
often freezes after a domU exits.

It seems related to vnconfig destroying the domU vnd backend, with a vnconfig
process stuck in I/O inside readdisklabel(). The backtrace inside
readdisklabel() varies a bit. Last time it was
biowait/scan_mbr/readdisklabel, this time it is
biowait/convertdisklabel/validate_label/readdisklabel.

Any hint on how to debug this? Does a system freeze mean that some thread has
entered an spl() section and is waiting there before calling splx()?

And is there a way to figure out whether the readdisklabel() is on / or on the
vnd device?

Today's sample: 

db> ps
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
11914    1 3   0         0   ffffa0000163a520               cron fstchg
12921    1 3   0         0   ffffa00001a43300               cron fstchg
12560    1 3   0         0   ffffa00001386aa0               cron wait
17842    1 3   0         0   ffffa0000163a100               cron wait
9771     1 3   0        80   ffffa00001362a80           vnconfig fstcnt
20085    1 3   0         0   ffffa000019fe2c0           vnconfig biowait
20456    1 3   0        80   ffffa00001c71460                 sh wait
20237    1 3   0        80   ffffa00001a8cb80                 sh wait
15768    1 3   0         0   ffffa00001c7b8a0                ksh fstchg
16108    1 3   0        80   ffffa00001c71880                 su wait
15238    1 4   0   1000000   ffffa00001362660                ksh
18335    1 4   0   1000000   ffffa0000232f080                 su
22885    1 3   0        80   ffffa00001a1d2e0                ksh pause
3572     1 3   0        80   ffffa00001a97780               sshd select
18875    1 3   0        80   ffffa00001a58b60               sshd select
10208    1 3   0         0   ffffa000010b2a60              getty fstchg
559      2 3   0        80   ffffa00002f5d180                 xl netio
559      1 3   0        80   ffffa00002f5d9c0                 xl select
11784    2 3   0        80   ffffa00001c71040                 xl netio
11784    1 3   0         0   ffffa00001a58320                 xl fstchg
892      1 3   0        80   ffffa000010b2640              getty ttyraw
824      1 3   0        80   ffffa0000108c200              getty ttyraw
868      1 3   0        80   ffffa000010b2220              getty ttyraw
857      1 3   0         0   ffffa0000232f4a0               cron fstchg
982      1 3   0        80   ffffa00001adb440              inetd kqueue
722      1 3   0   1000000   ffffa00001c7b480               qmgr fstchg
887      1 3   0   1000000   ffffa0000232f8c0             master fstchg
113      1 3   0        80   ffffa00001adb020               sshd select
644      1 3   0        80   ffffa00001aa9000             powerd kqueue
540      3 3   0         0   ffffa00001aa9420              pcscd fstchg
540      2 3   0        80   ffffa00001aa9840              pcscd pipe_rd
540      1 3   0        80   ffffa00001a97360              pcscd select
533      1 3   0        80   ffffa00001a8c760               ntpd pause
350      2 3   0        80   ffffa00001386260        xenconsoled netio
350      1 3   0        80   ffffa00001362240        xenconsoled select
263      1 3   0        80   ffffa0000160c6a0          xenstored select
204      1 3   0         0   ffffa0000160cac0            syslogd fstchg
1        1 3   0        80   ffffa00000edf1a0               init wait
0       85 3   0       200   ffffa00002f3d160            xbdb6i3 xbdb6i3
0       84 3   0       200   ffffa00000865080               vnd0 fstchg
0       82 5   0       200   ffffa00001a1db20           (zombie)
0       81 3   0       200   ffffa00001c7b060               vnd2 vndbp
0       80 3   0       200   ffffa00001a97ba0               vnd1 fstchg
0       64 3   0       200   ffffa00001386680        xen_balloon xen_balloon
0       63 3   0       200   ffffa0000160c280       bridge_rtage bridge_rtage
0       62 3   0       200   ffffa0000108b600            physiod physiod
0       61 3   0       200   ffffa0000108c620           aiodoned aiodoned
0       60 3   0       200   ffffa0000108ca40            ioflush syncer
0       59 3   0       200   ffffa0000108b1e0           pgdaemon pgdaemon
0       56 3   0       200   ffffa00000fa91c0            raidio0 raidiow
0       55 3   0       200   ffffa00000fa95e0              raid0 rfnodeq
0       51 3   0       200   ffffa00000ebb540           scsibus0 sccomp
0       50 3   0       200   ffffa00000ebb960               usb1 usbevt
0       49 3   0       200   ffffa00000fa9a00               usb0 usbevt
0       48 3   0       200   ffffa00000edf5c0            rt_free rt_free
0       47 3   0       200   ffffa00000edf9e0              unpgc unpgc
0       46 3   0       200   ffffa00000ebb120    key_timehandler key_timehandler
0       45 3   0       200   ffffa00000ebc980    icmp6_wqinput/0 icmp6_wqinput
0       44 3   0       200   ffffa00000ebc560          nd6_timer nd6_timer
0       43 3   0       200   ffffa00000ebc140     icmp_wqinput/0 icmp_wqinput
0       42 3   0       200   ffffa00000ebd9a0           rt_timer rt_timer
0       41 3   0       200   ffffa00000ebd580        vmem_rehash vmem_rehash
0       40 3   0       200   ffffa00000ec0180             xenbus rdst
0       39 3   0       200   ffffa00000ec05a0           xenwatch evtsq
0       38 3   0       200   ffffa00000ec09c0            acpitz1 acpitz1
0       37 3   0       200   ffffa00000ebd160            acpitz0 acpitz0
0       28 3   0       200   ffffa00000c0d100               iic0 iicintr
0       27 3   0       280   ffffa00000c0d520              spkr0 bellcv
0       26 3   0       280   ffffa00000c0d940           audiomix play
0       25 3   0       280   ffffa00000be90e0           audiorec record
0       24 3   0       200   ffffa00000be9500            atabus5 atath
0       23 3   0       200   ffffa00000be9920            atabus4 atath
0       22 3   0       200   ffffa00000bdb0c0            atabus3 atath
0       21 3   0       200   ffffa00000bdb4e0            atabus2 atath
0       20 3   0       200   ffffa00000bdb900            atabus1 atath
0       19 3   0       200   ffffa00000bd20a0            atabus0 atath
0       18 3   0       200   ffffa00000bd24c0         usbtask-dr usbtsk
0       17 3   0       200   ffffa00000bd28e0         usbtask-hc usbtsk
0       15 3   0       200   ffffa000008654a0             sysmon smtaskq
0       14 3   0       200   ffffa000008658c0         pmfsuspend pmfsuspend
0       13 3   0       200   ffffa0000085f060           pmfevent pmfevent
0       12 3   0       200   ffffa0000085f480         sopendfree sopendfr
0       11 3   0       200   ffffa0000085f8a0           nfssilly nfssilly
0       10 3   0       200   ffffa00000725040            cachegc cachegc
0        9 3   0       200   ffffa00000725460             vdrain vdrain
0        8 3   0       200   ffffa00000725880          modunload mod_unld
0        7 3   0       200   ffffa0000071c020            xcall/0 xcall
0        6 1   0       200   ffffa0000071c440          softser/0
0        5 1   0       200   ffffa0000071c860          softclk/0
0        4 1   0       200   ffffa0000071a000          softbio/0
0        3 1   0       200   ffffa0000071a420          softnet/0
0    >   2 7   0       201   ffffa0000071a840             idle/0
0        1 3   0       200   ffffffff80d5bb00            swapper uvm

db> call fstrans_dump
Fstrans locks by lwp:
20085.1  (/) shared 1 cow 0
Fstrans state by mount:
/                state suspended
0


db> bt/a ffffa000019fe2c0
trace: pid 20085 lid 1 at 0xffffa0002d1bb910
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0x90
biowait() at netbsd:biowait+0x38
convertdisklabel() at netbsd:convertdisklabel+0x125
validate_label() at netbsd:validate_label+0x16c
readdisklabel() at netbsd:readdisklabel+0x1bc
vndopen() at netbsd:vndopen+0x2db
spec_open() at netbsd:spec_open+0x385
VOP_OPEN() at netbsd:VOP_OPEN+0x2f
vn_open() at netbsd:vn_open+0x1e9
do_open() at netbsd:do_open+0x112
do_sys_openat() at netbsd:do_sys_openat+0x68
sys_open() at netbsd:sys_open+0x24
syscall() at netbsd:syscall+0x9c


db> bt/a  ffffa00001362a80
trace: pid 9771 lid 1 at 0xffffa0002ce33860
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait_sig() at netbsd:cv_wait_sig+0x93
fstrans_setstate() at netbsd:fstrans_setstate+0x89
genfs_suspendctl() at netbsd:genfs_suspendctl+0x53
vfs_suspend() at netbsd:vfs_suspend+0x5b
vrevoke_suspend_next.part.1() at netbsd:vrevoke_suspend_next.part.1+0x16
vrevoke() at netbsd:vrevoke+0x27
genfs_revoke() at netbsd:genfs_revoke+0xd
VOP_REVOKE() at netbsd:VOP_REVOKE+0x2e
vdevgone() at netbsd:vdevgone+0x5a
vnddoclear() at netbsd:vnddoclear+0xb9
vndioctl() at netbsd:vndioctl+0x363
VOP_IOCTL() at netbsd:VOP_IOCTL+0x37
vn_ioctl() at netbsd:vn_ioctl+0xa6
sys_ioctl() at netbsd:sys_ioctl+0x101
syscall() at netbsd:syscall+0x9c
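
As an aside, with a ps listing this long it can help to group the LWPs by
wait channel, to see at a glance how many are in fstchg versus biowait or
fstcnt. The following is only a rough sketch: group_by_wchan is a
hypothetical helper name, the column layout is assumed to match the ps
output above, and the sample is a handful of lines copied from it.

```python
from collections import defaultdict

def group_by_wchan(ps_text):
    """Group ddb 'ps' lines by wait channel (hypothetical helper).

    Assumes the columns are PID LID S CPU FLAGS LWP-address NAME [WAIT],
    as in the listing above. LWPs without a wait channel go under "-".
    """
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        # Drop the ">" marker that ddb prints next to the current LWP.
        fields = [f for f in line.split() if f != ">"]
        # Skip the header and anything that does not look like an LWP row.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        pid, lid, name = fields[0], fields[1], fields[6]
        wchan = fields[7] if len(fields) > 7 else "-"
        groups[wchan].append(f"{pid}.{lid} {name}")
    return dict(groups)

# A few lines copied from the ps output above.
sample = """\
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
9771     1 3   0        80   ffffa00001362a80           vnconfig fstcnt
20085    1 3   0         0   ffffa000019fe2c0           vnconfig biowait
15768    1 3   0         0   ffffa00001c7b8a0                ksh fstchg
0       84 3   0       200   ffffa00000865080               vnd0 fstchg
0       82 5   0       200   ffffa00001a1db20           (zombie)
0    >   2 7   0       201   ffffa0000071a840             idle/0
"""

if __name__ == "__main__":
    for wchan, lwps in sorted(group_by_wchan(sample).items()):
        print(f"{wchan:10s} {len(lwps):3d}  {', '.join(lwps)}")
```

Fed the full listing saved to a file, this would print one line per wait
channel with the count and the pid.lid of each LWP sleeping on it.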




-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu%netbsd.org@localhost

