Subject: kern/16067: [dM] shared libraries deadlock
To: None <gnats-bugs@gnats.netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: netbsd-bugs
Date: 03/26/2002 11:10:14
>Number: 16067
>Category: kern
>Synopsis: [dM] Deadlock when loading shared libraries
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 26 08:12:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: der Mouse
>Release: Proprietary PPC port derived from 1.5W
>Organization:
Dis-
>Environment:
Proprietary PPC port derived from 1.5W
>Description:
I've been asked to post this by a company that's been working
with NetBSD. I don't know much about the problem beyond what's
here, as I did not see this myself, but I can get mail back to
the actual originators. (I realize this may seem to be a
rather disorganized collection of information; it's what they
sent me, and as I say, I didn't see it happen myself. I'm not
entirely sure why they asked me to post it instead of doing it
themselves. I'm trying to help them some in tracking it down
myself; if anyone has any ideas, I/we would most appreciate
hearing them.)
Symptom: deadlock situation when running script loading applications linked w/
shared libs
Proprietary NetBSD PPC port derived from the 1.5W
/* $NetBSD: param.h,v 1.128 2001/06/03 02:48:45 thorpej Exp $ */
#define __NetBSD_Version__ 105230000 /* NetBSD 1.5W */
/* $NetBSD: sys_process.c,v 1.67 2001/03/17 09:38:36 pooka Exp $ */
/* $NetBSD: kern_synch.c,v 1.104 2001/05/28 22:20:03 chs Exp $ */
/* $NetBSD: kern_lock.c,v 1.55 2001/06/05 04:38:09 thorpej Exp $ */
/* $NetBSD: uvm_fault_i.h,v 1.13 2001/06/02 18:09:26 chs Exp $ */
/* $NetBSD: uvm_map.h,v 1.28 2001/06/02 18:09:27 chs Exp $ */
/* $NetBSD: uvm_fault.c,v 1.64 2001/06/02 18:09:26 chs Exp $ */
/* $NetBSD: uvm_map.c,v 1.99 2001/06/02 18:09:26 chs Exp $ */
/* $NetBSD: uvm_io.c,v 1.15 2001/06/02 18:09:26 chs Exp $ */
/* $NetBSD: uvm_vnode.c,v 1.50 2001/05/26 21:27:21 chs Exp $ */
/* $NetBSD: procfs_mem.c,v 1.27 2000/11/24 18:58:37 chs Exp $ */
/* $NetBSD: layer_vnops.c,v 1.6 2001/06/07 13:32:47 wiz Exp $ */
In platform-dependent part: arch/my_ppc
/* $NetBSD: cpu.c,v 1.1 2000/02/29 15:21:46 nonaka Exp $ */
/* $NetBSD: locore.s,v 1.8 2000/11/16 05:38:33 thorpej Exp $ */
/* $NetBSD: machdep.c,v 1.11 2000/09/13 15:00:22 thorpej Exp $ */
in arch/powerpc
/* $NetBSD: Locore.c,v 1.4 2000/06/08 06:48:45 kleink Exp $ */
/* $NetBSD: locore_subr.S,v 1.2 2001/02/28 20:44:41 tsubai Exp $ */
/* $NetBSD: mem.c,v 1.9 2001/02/04 17:38:11 briggs Exp $ */
/* $NetBSD: pmap.c,v 1.44 2001/06/10 11:01:27 tsubai Exp $ */
/* $NetBSD: powerpc_machdep.c,v 1.4 2001/04/05 09:58:05 tsubai Exp $ */
/* $NetBSD: process_machdep.c,v 1.5 2001/02/04 17:38:11 briggs Exp $ */
/* $NetBSD: sys_machdep.c,v 1.3 2000/06/09 14:08:45 kleink Exp $ */
/* $NetBSD: trap.c,v 1.46 2001/06/10 16:31:59 tsubai Exp $ */
/* $NetBSD: trap_subr.S,v 1.6 2001/06/08 00:16:25 matt Exp $ */
/* $NetBSD: trap_subr_mp.S,v 1.2 2001/06/10 11:09:28 tsubai Exp $ */
/* $NetBSD: vm_machdep.c,v 1.28 2001/06/10 11:01:28 tsubai Exp $ */
db> ps /a
PID COMMAND STRUCT PROC * UAREA * VMSPACE/VM_MAP
87 my_asynctst 0x5c33570 0xefbd5000 0x324bd48
85 my_snmpsub 0x46d1c88 0xefbc2000 0x324ba28
83 crunch 0x5c331e0 0xefbcd000 0x324bbb8
82 crunch 0x5c33018 0xefbc8000 0x324baf0
80 my_snmptst 0x46d1730 0xefbb3000 0x324b7d0
79 syslogd 0x46d1ac0 0xefbbe000 0x324b960
77 my_xtest 0x46d18f8 0xefbb9000 0x324b898
74 my_xp 0x46d1568 0xefbaf000 0x324b708
72 my_event_tst 0x46d11d8 0xefba5000 0x324b3e8
68 my_xp 0x46d1010 0xefba1000 0x324b640
64 crunch 0x3277728 0xefb8f000 0x324b190
56 crunch 0x3277c80 0xefb9d000 0x324b258
49 crunch 0x32778f0 0xefb98000 0x324b578
33 crunch 0x3277ab8 0xefb94000 0x324b320
12 crunch 0x3277560 0xefb8b000 0x324b0c8
11 aiodoned 0x3277398 0xefb86000 0x159680
10 ioflush 0x32771d0 0xefb82000 0x159680
9 reaper 0x3277008 0xefb7e000 0x159680
8 pagedaemon 0x3246c78 0xefb7a000 0x159680
7 nfsio 0x3246ab0 0xefb76000 0x159680
6 nfsio 0x32468e8 0xefb72000 0x159680
5 nfsio 0x3246720 0xefb6e000 0x159680
4 nfsio 0x3246558 0xefb6a000 0x159680
1 crunch 0x3246000 0xefb5e000 0x324b000
0 swapper 0x159748 0x1bf000 0x159680
88 crunch 0x5c33738 0xefbd9000 0x324be10
db> ps /w
PID COMMAND EMUL PRI UTIME STIME WAIT-MSG WAIT-CHANNEL
87 my_asynctst netbsd 24 0.1 0.2 select selwait
85 my_snmpsub netbsd 4 2.0 2.1 vmmaplk kernel_map_store+0x4
83 crunch netbsd 4 0.0 0.1 vmmaplk kernel_map_store+0x4
82 crunch netbsd 4 0.1 0.0 vmmaplk kernel_map_store+0x4
80 my_snmptst netbsd 4 0.5 0.4 vmmaplk kernel_map_store+0x4
79 syslogd netbsd 4 0.0 0.1 vmmaplk kernel_map_store+0x4
77 my_xtest netbsd 24 0.0 0.0 select selwait
74 my_xp netbsd 4 0.1 0.3 vmmaplk kernel_map_store+0x4
72 my_event_tst netbsd 4 0.0 0.1 vmmaplk kernel_map_store+0x4
68 my_xp netbsd 40 0.3 0.6 lockf 0xe09f5140
64 crunch netbsd 20 0.0 0.2 vnlock 0x5f56c7c
56 crunch netbsd 24 0.0 0.0 select selwait
49 crunch netbsd 4 0.7 0.0 vmmaplk kernel_map_store+0x4
33 crunch netbsd 32 0.0 0.1 mfsidl 0x39fbe08
12 crunch netbsd 4 0.0 0.3 vmmaplk kernel_map_store+0x4
11 aiodoned netbsd 4 0.0 0.0 aiodoned uvm+0x34
10 ioflush netbsd 4 0.0 0.0 vmmaplk kernel_map_store+0x4
9 reaper netbsd 4 0.0 0.6 vmmaplk kernel_map_store+0x4
8 pagedaemon netbsd 4 0.0 0.0 pgdaemon uvm+0x28
7 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0xc
6 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0x8
5 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0x4
4 nfsio netbsd 32 0.0 0.1 nfsidl nfs_iodwant
1 crunch netbsd 32 0.0 0.1 wait 0x3246000
0 swapper netbsd 4 0.0 0.0 scheduler proc0
88 crunch netbsd 50 0.0 0.0
db>
From ps /a
74 my_xp 0x46d1568 0xefbaf000 0x324b708
my_xp proc = 0x46d1568 (PID=74)
0x46d1568: 46d1ac0 #p_forw=syslogd
0 #p_back
46d11d8 #p_list=my_event_tst (first proc)
46d1900
32471c8 #p_cred
3248480 #p_fd
3249090 #p_cwdi
efbaf128 #p_stats
159928 #p_limit
324b708 #p_vmspace
43f2c88 #p_sigacts
14 #p_exitsig
4006 #p_flag
16b4a0 #p_cpu (->struct schedstate_percpu ci_schedstate)
3000000 #p_stat=3
4a #PID (74)
0
e0892928
46d11d8
46d1778
3277728
46d11d8
46d1784
0
0
0
0
0
0
140a8c
0
e8f75ea0
0
37d4
46d1568
0
0
0
1271bc
2ed
29a
0
Find out PCB (ptr to pcb at proc+0x1b8):
db> examine /x 0x46d1568+0x1b8
0x46d1720: efbaf000
Display PCB contents:
db> examine /x efbaf000,10
0xefbaf000: e09e2a80 534da80 efbb2980 a efbb2ce8 1
0xefbaf018: ffff 21a238ac 0 0 0 0
0xefbaf030: 0 0 0
The SPL = 0xa and SP=efbb2980
in locore.s (cpu_switch):
lwz 1,PCB_SP(4) /* get new procs SP */
ori 3,3,PSL_EE@l /* interrupts are okay again */
mtmsr 3
lmw 10,8(1) /* get other regs */ /*N=32-10=22
lwz 1,0(1) /* get saved SP */
mr 2,12 /* get saved r2 */
mtcr 11 /* get saved cr */
isync
mtsr USER_SR,10 /* get saved USER_SR */
isync
switch_return:
mr 30,7 /* save proc pointer */
lwz 3,PCB_SPL(4)
bl _C_LABEL(lcsplx)
mr 3,30 /* get curproc for special fork
returns */
lwz 31,12(1)
lwz 30,8(1)
addi 1,1,16
lwz 0,4(1)
mtlr 0
blr
The map of the stack as pointed by PCB_SP(4):
+00 R01(sv) -> +00
+04
+08 R10 +08 R30
+0C R11 +0C R31
...... +10 <-Process R1
+5C R31 +14 <-Process LR
For pid 74 (my_xp)
0xefbb2980 :
+00 efbb29e0 ->
0xefbb29e0:
efbb29f0
5164
46d1568 R30
0 R31
efbb2a20 R01
2c1d0 LR
3c764d9b
cf7ea
efbb2a20
0
140a8c
8000
0
Call stack my_xp:
efbb2a20 2c1d0 mi_switch (2c020)
0xefbb2a20: efbb2a50 2b910 bpendtsleep(2b910) kern_synch.c#474
0xefbb2a50: efbb2a80 22ca0 lockmgr(229dc) kern_lock.c#543
0xefbb2a80: efbb2bc0 ca440 uvm_fault(ca2d8) uvm_fault_i.h#179
0xefbb2bc0: efbb2c30 e1c98 ->trap (e1b14) trap.c#129
0xefbb2c30: efbb2ce0 578c ->trapexit
0xefbb2ce0: efbb2d60 e6e54 ->intrctlr_dispatch (e6de8)
0xefbb2d60: efbb2d80 29cdc ->uiomove (29bb0)
0xefbb2d80: efbb2e10 bd884 ->ffs_write (bd5d8)
0xefbb2e10: efbb2e60 60944 ->vn_write (607a0)
0xefbb2e60: efbb2ec0 392c4 ->dofilewrite (391e4)
0xefbb2ec0: efbb2ee0 391c8 ->sys_write (3912c)
0xefbb2ee0: efbb2f50 e1f28 ->trap (e1b14)
0xefbb2f50: ffffe8b0 578c ->trapexit
db> show uvmexp
Current UVM status:
pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
127784 VM pages: 6899 active, 0 inactive, 117 wired, 110861 free
min 10% (25) anon, 10% (25) vnode, 5% (12) vtext
pages 3705 anon, 2383 vnode, 802 vtext
freemin=64, free-target=85, inactive-target=0, wired-max=42594
faults=18277, traps=0, intrs=0, ctxswitch=24476
softint=16374, syscalls=195770, swapins=0, swapouts=0
fault counts:
noram=0, noanon=0, pgwait=0, pgrele=0
ok relocks(total)=762(762), anget(retrys)=2322(0), amapcopy=1961
neighbor anon/obj pg=1434/23959, gets(lock/unlock)=11232/762
cases: anon=1464, anoncow=858, obj=7258, prcopy=3974, przero=4324
daemon and swap counts:
woke=0, revs=0, scans=0, obscans=0, anscans=0
busy=0, freed=0, reactivate=0, deactivate=0
pageouts=0, pending=0, nswget=0
nswapdev=0, nanon=119381, nanonneeded=119381 nfreeanon=115676
swpages=0, swpginuse=0, swpgonly=0 paging=0
kernel pointers:
objs(kern/kmem/mb)=0x140a24/0x140b20/0x140b38
0x32
db>
db> call uvmcnt_dump
# map lookup hint hits = 37834
# map lookup calls = 52478
# uvm_map() missed forward = 316
# uvm_map() back merges = 4950
# uvm_map() successful calls = 9201
0x15bacc
db>
>How-To-Repeat:
Unknown.
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:
>Unformatted: