tech-kern archive


Re: Zombie kernel thread



Emmanuel Dreyfus <manu%netbsd.org@localhost> wrote:

> Another NetBSD 8.0 related problem: on an amd64 xen DOM0, system hangs,
> and I find this unusual thing:

I observed this problem on NetBSD-8.0/i386 GENERIC and NetBSD-8.0/amd64
XEN3_DOM0. In the meantime, I discovered the zombie kernel thread can be
observed without crashing the machine. 

While a domU is running:
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       79 3   0       200   ffffa00002597480            xbdb5i5 xbdb5i5
0       78 3   0       200   ffffa00001a5b720            xbdb5i3 xbdb5i3
0       77 3   0       200   ffffa00001c42860               vnd2 vndbp
0       76 3   0       200   ffffa000019d72a0               vnd1 vndbp
0       66 3   0       200   ffffa00001ac0780            xbdb1i3 xbdb1i3
0       65 3   0       200   ffffa00000865080               vnd0 vndbp

After shutting down the domU:
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       76 5   0       200   ffffa000019d72a0           (zombie)
0       66 3   0       200   ffffa00001ac0780            xbdb1i3 xbdb1i3
0       65 3   0       200   ffffa00000865080               vnd0 vndbp
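Comparing the two listings above by LID and struct lwp address is what
identifies the thread that became the zombie. A minimal sketch of that
comparison, assuming the column layout shown above; `parse_listing` and
`name_zombies` are hypothetical helper names for illustration, not ddb
facilities:

```python
# Sketch: match "(zombie)" rows in a later LWP listing against an earlier one.
# Assumes whitespace-separated columns: PID LID S CPU FLAGS ADDR NAME [WAIT].

BEFORE = """\
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       79 3   0       200   ffffa00002597480            xbdb5i5 xbdb5i5
0       78 3   0       200   ffffa00001a5b720            xbdb5i3 xbdb5i3
0       77 3   0       200   ffffa00001c42860               vnd2 vndbp
0       76 3   0       200   ffffa000019d72a0               vnd1 vndbp
0       66 3   0       200   ffffa00001ac0780            xbdb1i3 xbdb1i3
0       65 3   0       200   ffffa00000865080               vnd0 vndbp
"""

AFTER = """\
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       76 5   0       200   ffffa000019d72a0           (zombie)
0       66 3   0       200   ffffa00001ac0780            xbdb1i3 xbdb1i3
0       65 3   0       200   ffffa00000865080               vnd0 vndbp
"""


def parse_listing(text):
    """Map (LID, struct lwp address) to the thread name for each data row."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header and blank lines: data rows start with a numeric PID.
        if len(parts) < 7 or not parts[0].isdigit():
            continue
        lid, addr, name = int(parts[1]), parts[5], parts[6]
        rows[(lid, addr)] = name
    return rows


def name_zombies(before, after):
    """For each zombie row in `after`, look up its former name in `before`."""
    old = parse_listing(before)
    return {
        key: old.get(key)  # None if the LWP was not in the earlier listing
        for key, name in parse_listing(after).items()
        if name == "(zombie)"
    }


print(name_zombies(BEFORE, AFTER))
```

Keying on both LID and address is deliberate: an address alone can be
reused by a later thread, so the pair is the safer match.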

The LID and struct lwp address match, which tells us the zombie was the
vnd1 thread. Restarting the domU makes it disappear and everything is
back to normal, except if the domU freezes, in which case I have:

PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       82 5   0       200   ffffa0000160dac0           (zombie)
0       81 3   0       200   ffffa00001c42860               vnd2 fstchg
0       80 3   0       200   ffffa000019d72a0               vnd1 vndbp
0       66 3   0       200   ffffa00001ac0780            xbdb1i3 xbdb1i3
0       65 3   0       200   ffffa00000865080               vnd0 vndbp

db> bt/a ffffa000019d72a0
trace: pid 0 lid 80 at 0xffffa0002d7afe30
sleepq_block() at netbsd:sleepq_block+0x99
vndthread() at netbsd:vndthread+0x548

db> bt/a ffffa00001c42860
trace: pid 0 lid 81 at 0xffffa0002d4a7b00
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0x90
fstrans_start() at netbsd:fstrans_start+0x5a9
genfs_do_putpages() at netbsd:genfs_do_putpages+0xc38
VOP_PUTPAGES() at netbsd:VOP_PUTPAGES+0x36
vndthread() at netbsd:vndthread+0x489

db> call fstrans_dump
Fstrans locks by lwp:
2942.1   (/) shared 1 cow 0
Fstrans state by mount:
/                state suspended
0

PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
2942     1 3   0         0   ffffa00001361a80           vnconfig biowait

db> bt/a ffffa00001361a80
trace: pid 2942 lid 1 at 0xffffa0002e4218d0
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0x90
biowait() at netbsd:biowait+0x38
scan_mbr() at netbsd:scan_mbr+0x3d
readdisklabel() at netbsd:readdisklabel+0x14b
vndopen() at netbsd:vndopen+0x2db
spec_open() at netbsd:spec_open+0x385
VOP_OPEN() at netbsd:VOP_OPEN+0x2f
vn_open() at netbsd:vn_open+0x1e9
do_open() at netbsd:do_open+0x112
do_sys_openat() at netbsd:do_sys_openat+0x68
sys_open() at netbsd:sys_open+0x24
syscall() at netbsd:syscall+0x9c
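
The backtraces above can be reduced to call chains to see at a glance
where each thread is sleeping. A minimal sketch, assuming the
`function() at netbsd:function+0xNN` frame format shown above;
`call_chain` is a hypothetical helper name, and the sample input is the
vnd2 trace copied from this message:

```python
import re

# One frame line, e.g. "cv_wait() at netbsd:cv_wait+0x90".
# The backreference \1 requires the symbol after "netbsd:" to repeat the name.
FRAME = re.compile(r"^(\w+)\(\) at netbsd:\1\+0x[0-9a-f]+$")

VND2_TRACE = """\
trace: pid 0 lid 81 at 0xffffa0002d4a7b00
sleepq_block() at netbsd:sleepq_block+0x99
cv_wait() at netbsd:cv_wait+0x90
fstrans_start() at netbsd:fstrans_start+0x5a9
genfs_do_putpages() at netbsd:genfs_do_putpages+0xc38
VOP_PUTPAGES() at netbsd:VOP_PUTPAGES+0x36
vndthread() at netbsd:vndthread+0x489
"""


def call_chain(trace):
    """Return function names from innermost frame to outermost frame."""
    chain = []
    for line in trace.splitlines():
        match = FRAME.match(line.strip())
        if match:  # the "trace: pid ..." header line does not match
            chain.append(match.group(1))
    return chain


# Innermost frame first, so the sleeping function is on the left.
print(" <- ".join(call_chain(VND2_TRACE)))
```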

A never-ending I/O on the MBR read? What does that mean? I understand this
is the MBR of the domU disk image, but that image has no MBR: I use a
disklabel without an MBR.




-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu%netbsd.org@localhost

