Subject: kern/32287: Processes hang in "mclpl"
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Christian Biere <christianbiere@gmx.de>
List: netbsd-bugs
Date: 12/12/2005 19:55:02
>Number:         32287
>Category:       kern
>Synopsis:       Processes hang in "mclpl"
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Dec 12 19:55:01 +0000 2005
>Originator:     Christian Biere
>Release:        NetBSD 3.99.13
>Environment:
System: NetBSD cyclonus 3.99.13 NetBSD 3.99.13 (STARSCREAM) #5: Thu Dec 1 02:47:07 CET 2005 bin@cyclonus:/o/NetBSD/obj/sys/arch/i386/compile/STARSCREAM i386
Architecture: i386
Machine: i386
>Description:

Recently, my machine often ends up with unkillable processes hanging
in "mclpl". It usually starts with a single process, but after a
moment all network-using processes (including local X apps) hang
in "mclpl" as well. When I shut down from the console, the system
panics but doesn't reboot as expected. For what it's worth, ddb is
normally disabled on the console but enabled during shutdown, because
the system often hangs during shutdown, which seems to be related to
using an mfs-based /dev.

After a reboot, I get thousands of these:
pmap_unwire: wiring for pmap 0xca209c1c va 0x843a000 didn't change!
pmap_unwire: wiring for pmap 0xca209c1c va 0x844a000 didn't change!

I don't get "mclpool limit reached" or any other messages in
/var/log/messages, so it doesn't look as if increasing NMBCLUSTERS
would help.
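
A rough way to cross-check the cluster pool state is sketched below. It
assumes the stock ps(1), netstat(1), vmstat(1) and sysctl(8) tools;
count_wchan is a made-up helper name used only for illustration, and no
output from the affected machine is shown.

```shell
#!/bin/sh
# count_wchan is a hypothetical helper name, not an existing tool.
# It reads `ps -o pid,wchan,comm`-style output on stdin (header line first)
# and prints how many processes sleep on the wait channel given as $1.
count_wchan() {
    awk -v w="$1" 'NR > 1 && $2 == w { n++ } END { print n + 0 }'
}

# Intended use on the affected machine (not run here):
#   ps -axo pid,wchan,comm | count_wchan mclpl
#   netstat -m                      # mbuf and cluster usage summary
#   vmstat -m                       # per-pool statistics, including mclpl
#   sysctl kern.mbuf.nmbclusters    # current cluster limit
```

If the count keeps growing while the cluster usage stays well below the
limit, that would be consistent with the absence of "mclpool limit
reached" messages.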

(gdb) target kcore netbsd.54.core
panic: trap
#0  0x0fef0000 in ?? ()
(gdb) bt full
#0  0x0fef0000 in ?? ()
No symbol table info available.
#1  0xc02847d7 in cpu_reboot (howto=256, bootstr=0x0)
    at /s/NetBSD/src/sys/arch/i386/i386/machdep.c:752
	howto = 1
#2  0xc016438c in db_sync_cmd (addr=-1070538560, have_addr=0, 
    count=-1070869818, modif=0xcaf27044 "@O9À[pòÊ\001")
    at /s/NetBSD/src/sys/ddb/db_command.c:785
No locals.
#3  0xc0163ccc in db_command (last_cmdp=0xc03743a4, cmd_table=0xc030e0c0)
    at /s/NetBSD/src/sys/ddb/db_command.c:499
	cmd_table = (const struct db_command *) 0x0
	cmd = (const struct db_command *) 0xc030e260
	t = 0
	modif = "@O9À[pòÊ\001\0\0\0\001\0\0\0à\08Àä\08\r|pòÊäö(À\0/\0\0\r\0\0\0n\08À\003\0\0\0\n\0\0\0\n\0\0\0\234pòÊ:`\026À\n\0\0\0\001\0\0\0\234pòÊ\232X\026À `\0\0à\08À¼pòÊjX\026À\n\0\0\0Àà0À¼pòÊ-X\026À\0\0\0\0\0\0\0"
	addr = -1070538560
	count = -1070869818
	have_addr = 4096
	result = -1070538560
	last_count = -1070869818
#4  0xc01639a6 in db_command_loop () at /s/NetBSD/src/sys/ddb/db_command.c:290
	db_jmpbuf = {val = {0, -890081056, -890080964, 0, 1, -1072285368}}
	savejmp = (label_t *) 0x0
#5  0xc016711c in db_trap (type=1, code=0)
    at /s/NetBSD/src/sys/ddb/db_trap.c:101
	type = 0
	bkpt = 0
	watchpt = 0
#6  0xc0281c28 in kdb_trap (type=1, code=0, regs=0xcaf27288)
    at /s/NetBSD/src/sys/arch/i386/i386/db_interface.c:226
	regs = (db_regs_t *) 0x0
	flags = 0
	dbreg = {tf_gs = -890080860, tf_fs = -1145559025, tf_es = 1, 
  tf_ds = -890080748, tf_edi = -890080744, tf_esi = -890080808, 
  tf_ebp = -1071111976, tf_ebx = -1077940856, tf_edx = -1145559025, 
  tf_ecx = 0, tf_eax = 0, tf_trapno = -890080824, tf_err = 0, 
  tf_eip = -1069701583, tf_cs = -890080752, tf_eflags = 0, 
  tf_esp = -1069701583, tf_ss = 423, tf_vm86_es = -1071063929, tf_vm86_ds = 0, 
  tf_vm86_fs = -1071679264, tf_vm86_gs = -890080632}
#7  0xc028e171 in trap (frame=0xcaf27288)
    at /s/NetBSD/src/sys/arch/i386/i386/trap.c:310
	l = (struct lwp *) 0xcaf27288
	p = (struct proc *) 0xcad21398
	type = 1
	pcb = (struct pcb *) 0xcaf24000
	vframe = (struct trapframe *) 0xc030e0c0
	ksi = {ksi_flags = 1868652397, ksi_list = {cqe_next = 0x632e6a62, 
    cqe_prev = 0x3034313a}, ksi_info = {_signo = 2616, _code = 0, _errno = 0, 
    _reason = {_rt = {_pid = 0, _uid = 0, _sigval = {sival_int = 0, 
          sival_ptr = 0x0}}, _child = {_pid = 0, _uid = 0, _status = 0, 
        _utime = 0, _stime = 0}, _fault = {_addr = 0x0, _trap = 0}, _poll = {
        _band = 0, _fd = 0}}}}
	resume = 0
	onfault = 0x2 <Address 0x2 out of bounds>
	error = 0
	cr2 = 423
#8  0xc010af51 in calltrap ()
No symbol table info available.
#9  0xc021a5e2 in pool_get (pp=0xc038da20, flags=2)
    at /s/NetBSD/src/sys/kern/subr_pool.c:816
	pp = (struct pool *) 0xc038da20
	ph = (struct pool_item_header *) 0x0
	v = (void *) 0xc09cbc00
#10 0xc01bccbb in uvm_mapent_alloc (map=0xc09cbc00, flags=0)
    at /s/NetBSD/src/sys/uvm/uvm_map.c:463
	map = (struct vm_map *) 0x0
	flags = 0
	me = (struct vm_map_entry *) 0xc030e0c0
	pflags = -1070538560
#11 0xc01bd461 in uvm_map (map=0xc09cbc00, startp=0xcaf273a8, size=12288, 
    uobj=0x0, uoffset=-1, align=0, flags=262144)
    at /s/NetBSD/src/sys/uvm/uvm_map.c:807
	map = (struct vm_map *) 0xc09cbc00
	uoffset = -1
	flags = 262144
	args = {uma_prev = 0x1000, uma_start = 3365720064, uma_size = 0, 
  uma_uobj = 0x246, uma_uoffset = 7519946560, uma_flags = 0}
	new_entry = (struct vm_map_entry *) 0x0
	error = 262144
#12 0xc01c7da9 in uvm_pagermapin (pps=0xcaf274a0, npages=3, flags=1)
    at /s/NetBSD/src/sys/uvm/uvm_pager.c:155
	npages = 3
	flags = 1
	size = 12288
	kva = 0
	cva = 1
	pp = (struct vm_page *) 0xc030e0c0
	prot = 1
	kva = 0
	prot = 1
#13 0xc0258057 in genfs_gop_write (vp=0xcadc61fc, pgs=0xcaf274a0, npages=3, 
    flags=1) at /s/NetBSD/src/sys/miscfs/genfs/genfs_vnops.c:1470
	npages = 3
	error = 0
	run = 0
	fs_bshift = 14
	dev_bshift = 9
	kva = 61440
	eof = 61440
	offset = 12884901888
	startoffset = 49152
	bytes = 12288
	iobytes = 61440
	skipbytes = 0
	lbn = 7525402652
	blkno = 263886181807268
	pg = (struct vm_page *) 0xc06c6e38
	mbp = (struct buf *) 0x87
	bp = (struct buf *) 0xcaf274a0
	devvp = (struct vnode *) 0x2
	async = 1
#14 0xc0257a6e in genfs_putpages (v=0xcaf27690)
    at /s/NetBSD/src/sys/miscfs/genfs/genfs_vnops.c:1354
	v = (void *) 0xc030e0c0
	uobj = (struct uvm_object *) 0xcadc61fc
	startoff = 0
	endoff = 9223372036854771712
	off = 49152
	flags = 17
	i = 3
	error = 0
	npages = 3
	nback = 0
	freeflag = 32
	pgs = (struct vm_page *(*)[0]) 0xcaf274a0
	pg = (struct vm_page *) 0xc08c87e8
	nextpg = (struct vm_page *) 0x0
	tpg = (struct vm_page *) 0xcaf274a0
	curmp = {pageq = {tqe_next = 0xc0403000, tqe_prev = 0xc0403000}, 
  hashq = {tqe_next = 0x1, tqe_prev = 0xc0394f40}, listq = {
    tqe_next = 0xcaf27560, tqe_prev = 0xc08c87f8}, uanon = 0x18, 
  uobject = 0xcadc61fc, offset = -1, flags = 1, loan_count = 0, 
  wire_count = 9184, pqflags = 49318, phys_addr = 16384, mdpage = {
    mp_pvhead = {pvh_lock = {lock_data = 0, 
        lock_file = 0xbce <Address 0xbce out of bounds>, 
        unlock_file = 0xc0394f40 " B1ÀÀ\b8À\r", lock_line = 120, 
        unlock_line = 0, list = {tqe_next = 0xc0394f40, 
          tqe_prev = 0xc0403000}, lock_holder = 3224981312}, pvh_root = {
        sph_root = 0xc03808c0}}, mp_attrs = 24}}
	endmp = {pageq = {tqe_next = 0x3000, tqe_prev = 0x0}, hashq = {
    tqe_next = 0xcaf27598, tqe_prev = 0xc01c59d4}, listq = {tqe_next = 0x0, 
    tqe_prev = 0xcaf275d0}, uanon = 0x87, uobject = 0xcadc61fc, offset = -1, 
  flags = 1, loan_count = 51954, wire_count = 12288, pqflags = 0, 
  phys_addr = 12288, mdpage = {mp_pvhead = {pvh_lock = {lock_data = 0, 
        lock_file = 0xcaf275c8 "\001", 
        unlock_file = 0xc01ccebc "\203Ä\020\205À\211Ã\017\204×", 
        lock_line = 15152, unlock_line = -13413, list = {tqe_next = 0x3000, 
          tqe_prev = 0x0}, lock_holder = 0}, pvh_root = {sph_root = 0x1}}, 
    mp_attrs = -1063939840}}
	wasclean = 0
	by_list = 1
	needs_clean = 1
	yld = -890080096
	async = 1
	pagedaemon = 0
	l = (struct lwp *) 0xca20b4a4
	gp = (struct genfs_node *) 0xcadd29cc
	dirtygen = 2
	modified = 1
	cleanall = 1
#15 0xc01a7e56 in ffs_full_fsync (v=0xcaf27820)
    at /s/NetBSD/src/sys/sys/vnode_if.h:1671
	a = {a_desc = 0xc0319140, a_vp = 0xcadc61fc, a_offlo = 0, a_offhi = 0, 
  a_flags = 17}
	vp = (struct vnode *) 0xcadc61fc
	bp = (struct buf *) 0x0
	nbp = (struct buf *) 0xcaf27698
	s = 0
	error = 67112988
	passes = 4
	skipmeta = 0
	inodedeps_only = 0
	waitfor = -1070538560
	nbp = (struct buf *) 0xcaf27698
	s = 0
	passes = 4
	skipmeta = 0
	inodedeps_only = 0
#16 0xc01a75ac in ffs_fsync (v=0xcaf27820)
    at /s/NetBSD/src/sys/ufs/ffs/ffs_vnops.c:258
	bp = (struct buf *) 0x0
	s = 0
	num = 1
	error = 0
	i = -882164808
	ia = {{in_lbn = -4560052952387026628, in_off = -890079336, 
    in_exists = -1071307893}, {in_lbn = -3789206890304342144, 
    in_off = -890079304, in_exists = 582}, {in_lbn = -3829077844987638144, 
    in_off = 65554, in_exists = 65554}, {in_lbn = -3829077848212045806, 
    in_off = -890079304, in_exists = -1071353728}}
	bsize = -1070538560
	blk_high = 907
#17 0xc01a569f in ffs_sync (mp=0xc0b6d000, waitfor=2, cred=0xca20072c, 
    p=0xcad21398) at /s/NetBSD/src/sys/sys/vnode_if.h:739
	a = {a_desc = 0xc0318b40, a_vp = 0xcadc61fc, a_cred = 0xca20072c, 
  a_flags = 0, a_offlo = 0, a_offhi = 0, a_p = 0xcad21398}
	vp = (struct vnode *) 0xcadc61fc
	nvp = (struct vnode *) 0xcb1a5a20
	ip = (struct inode *) 0xc030e0c0
	ump = (struct ufsmount *) 0xc0ae1f00
	fs = <incomplete type>
	error = 0
	count = 65537
	allerror = 0
	ump = (struct ufsmount *) 0xc0ae1f00
	fs = <incomplete type>
	count = 65537
	allerror = 0
#18 0xc024ac06 in sys_sync (l=0xca20b4a4, v=0x0, retval=0x0)
    at /s/NetBSD/src/sys/kern/vfs_syscalls.c:653
	l = (struct lwp *) 0xc030e0c0
	mp = <incomplete type>
	nmp = <incomplete type>
	asyncflag = 0
	p = (struct proc *) 0xcad21398
	mp = <incomplete type>
#19 0xc0248c84 in vfs_shutdown () at /s/NetBSD/src/sys/kern/vfs_subr.c:2227
	l = (struct lwp *) 0xca20b4a4
	p = (struct proc *) 0xcad21398
#20 0xc02847eb in cpu_reboot (howto=256, bootstr=0x0)
    at /s/NetBSD/src/sys/arch/i386/i386/machdep.c:738
	howto = 256
#21 0xc021c9e8 in panic (fmt=0xc0335101 "trap")
    at /s/NetBSD/src/sys/kern/subr_prf.c:244
	fmt = 0x0
	bootopt = 256
#22 0xc028e1fb in trap (frame=0xcaf27984)
    at /s/NetBSD/src/sys/arch/i386/i386/trap.c:336
	l = (struct lwp *) 0x10282
	p = (struct proc *) 0xcad21398
	type = 6
	pcb = (struct pcb *) 0xcaf24000
	vframe = (struct trapframe *) 0xc0335101
	ksi = {ksi_flags = 1, ksi_list = {cqe_next = 0x0, cqe_prev = 0x0}, 
  ksi_info = {_signo = 0, _code = 1, _errno = 0, _reason = {_rt = {_pid = 44, 
        _uid = 6, _sigval = {sival_int = 0, sival_ptr = 0x0}}, _child = {
        _pid = 44, _uid = 6, _status = 0, _utime = 0, _stime = 0}, _fault = {
        _addr = 0x2c, _trap = 6}, _poll = {_band = 44, _fd = 6}}}}
	resume = 0
	onfault = 0x0
	error = 0
	cr2 = 44
#23 0xc010af51 in calltrap ()
No symbol table info available.
#24 0xc01b3ffa in uao_pagein (aobj=0xc0380b20, startslot=1, endslot=262145)
    at /s/NetBSD/src/sys/uvm/uvm_aobj.c:1347
	slot = -1070378751
	i = 0
	elt = (struct uao_swhash_elt *) 0x0
	buck = 0
	startslot = -1070068960
	endslot = 44644
	rv = -1070378751
#25 0xc01b3eea in uao_swap_off (startslot=1, endslot=262145)
    at /s/NetBSD/src/sys/uvm/uvm_aobj.c:1282
	endslot = 262145
	aobj = (struct uvm_aobj *) 0xc0380b20
	nextaobj = (struct uvm_aobj *) 0x1
	rv = 1
#26 0xc01cb6b1 in swap_off (p=0xcad21398, sdp=0xc0b0c300)
    at /s/NetBSD/src/sys/uvm/uvm_swap.c:983
	p = (struct proc *) 0xcad21398
	sdp = (struct swapdev *) 0xc0b0c300
	npages = 262143
	error = -892202088
#27 0xc01caf7c in sys_swapctl (l=0xca20b4a4, v=0xcaf27f64, retval=0xcaf27f5c)
    at /s/NetBSD/src/sys/uvm/uvm_swap.c:656
	l = (struct lwp *) 0xc0335101
	v = (void *) 0xc03504a0
	p = (struct proc *) 0xcad21398
	vp = (struct vnode *) 0xcaafb3c0
	nd = {ni_dirp = 0xbfbfeed7 <Address 0xbfbfeed7 out of bounds>, 
  ni_segflg = UIO_USERSPACE, ni_startdir = 0x0, ni_rootdir = 0xca20ccb0, 
  ni_vp = 0xcaafb3c0, ni_dvp = 0xcaafbea0, ni_pathlen = 1, 
  ni_next = 0xcb66e809 "", ni_loopcnt = 0, ni_cnd = {cn_nameiop = 0, 
    cn_flags = 1097796, cn_proc = 0xcad21398, cn_cred = 0xca20072c, 
    cn_pnbuf = 0xcb66e800 "/dev/wd0b", cn_nameptr = 0xcb66e805 "wd0b", 
    cn_namelen = 4, cn_hash = 2156200086, cn_consume = 0}}
	spp = (struct swappri *) 0x0
	sdp = (struct swapdev *) 0xc03504a0
	userpath = "\220Ö Ê\230\023ÒÊüzòÊ\212W%À\020Å Ê\006\0\0\0\210Ä Ê\0\2075À<\001\0\0Ø~òÊ\034{òÊWW%À\020Å Ê\002\0\001\0L{òÊM\234\032À${òÊ\0\0\0\0\0\0\0\0\002\0\002\0\210Ä Ê\0\0\0\0L{òÊF\002\0\0À\2161À\210Ä Êàò4ÀH5\222À\0°\004\0\0\0\0\0l{òÊÔY\034À\220Ù8Ààò4À\207\0\0\0\0°\004\0d{òÊ\002\0\001\0!\005\0\0\0°\004\0\0°\004\0\0\0\0\0\234{òʼÎ\034À\204\216!Ê\0°\004\0\0\0\0\0\0\0\0\0\210Ä ÊÀ{òÊ"...
	len = 3391145096
	error = 0
	misc = -903870676
	priority = 4975
	vp = (struct vnode *) 0xcaafb3c0
	nd = {ni_dirp = 0xbfbfeed7 <Address 0xbfbfeed7 out of bounds>, 
  ni_segflg = UIO_USERSPACE, ni_startdir = 0x0, ni_rootdir = 0xca20ccb0, 
  ni_vp = 0xcaafb3c0, ni_dvp = 0xcaafbea0, ni_pathlen = 1, 
  ni_next = 0xcb66e809 "", ni_loopcnt = 0, ni_cnd = {cn_nameiop = 0, 
    cn_flags = 1097796, cn_proc = 0xcad21398, cn_cred = 0xca20072c, 
    cn_pnbuf = 0xcb66e800 "/dev/wd0b", cn_nameptr = 0xcb66e805 "wd0b", 
    cn_namelen = 4, cn_hash = 2156200086, cn_consume = 0}}
	spp = (struct swappri *) 0x0
	userpath = "\220Ö Ê\230\023ÒÊüzòÊ\212W%À\020Å Ê\006\0\0\0\210Ä Ê\0\2075À<\001\0\0Ø~òÊ\034{òÊWW%À\020Å Ê\002\0\001\0L{òÊM\234\032À${òÊ\0\0\0\0\0\0\0\0\002\0\002\0\210Ä Ê\0\0\0\0L{òÊF\002\0\0À\2161À\210Ä Êàò4ÀH5\222À\0°\004\0\0\0\0\0l{òÊÔY\034À\220Ù8Ààò4À\207\0\0\0\0°\004\0d{òÊ\002\0\001\0!\005\0\0\0°\004\0\0°\004\0\0\0\0\0\234{òʼÎ\034À\204\216!Ê\0°\004\0\0\0\0\0\0\0\0\0\210Ä ÊÀ{òÊ"...
	len = 3391145096
	error = 0
	priority = 4975
#28 0xc028dc87 in syscall_plain (frame=0xcaf27fa8)
    at /s/NetBSD/src/sys/arch/i386/i386/syscall.c:160
	params = 0x0
	callp = (const struct sysent *) 0xc03768f4
	l = (struct lwp *) 0xca20b4a4
	p = (struct proc *) 0xc0335101
	error = -1070378751
	argsize = 3224588545
	code = 271
	args = {2, -1077940521, 0, 4114, -1, 0, 0, 0}
	rval = {0, 0}
(gdb) quit

>How-To-Repeat:

I don't know whether it's application-specific, i.e., whether any kind
of server could cause the same problem, but I can reproduce it
almost instantly with net/gtk-gnutella from pkgsrc when it reaches
about 100 TCP connections.

>Fix: