Subject: kern/22080: switching with held simple_lock
To: None <gnats-bugs@gnats.netbsd.org>
From: None <mhitch@lightning.msu.montana.edu>
List: netbsd-bugs
Date: 07/06/2003 21:54:55
>Number:         22080
>Category:       kern
>Synopsis:       switching with held simple_lock
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 07 03:55:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Michael L. Hitch
>Release:        NetBSD 1.6.1
>Organization:
Michael L. Hitch                        mhitch@montana.edu
Operations Consulting,  Information Technology Center
Montana State University, Bozeman, MT     USA
>Environment:
	
	
System: NetBSD netbsd2.msu.montana.edu 1.6.1 NetBSD 1.6.1 (CS20.MP) #5: Sat Jun 28 21:49:35 MDT 2003 mhitch@netbsd2.msu.montana.edu:/usr/NetBSD-1.6.1/obj/alphaev56/.alpha/sys/arch/alpha/compile/CS20.MP alpha
Architecture: alpha
Machine: alpha
>Description:
While running a LOCKDEBUG kernel on an API CS20, the following console was displayed
about "switching with held simple_lock".  This had previously been reported in
PR 11190, and closed.  I was able to get a stack trace in DDB, but was unable to
get a dump file to get a more detailed analysis.  This happened during the daily
run.  From the stack trace, it looks like checkalias() has obtained a lock on
spechasl_slock, then called vget(), which then tried to get another lock and
called ltsleep and tried to switch processes. An attempt to continue resulted
in a trap, and a further attempt to get a crash dump hung the system.


switching with held simple_lock 0xfffffc00005a3708 CPU 0 /usr/NetBSD-1.6.1/src/sys/kern/vfs_subr.c:1051

simple_lock: locking against myself
lock: 0xfffffc00005a2af8, currently at: /usr/NetBSD-1.6.1/src/sys/kern/sys_generic.c:961
on cpu 0
last locked: /usr/NetBSD-1.6.1/src/sys/kern/kern_synch.c:400
last unlocked: /usr/NetBSD-1.6.1/src/sys/kern/kern_synch.c:594
alpha trace requires known PC =eject=
Stopped in pid 23463 (find) at  cpu_Debugger+0x4:       ret     zero,(ra)
db{0}> t
cpu_Debugger() at cpu_Debugger+0x4
_simple_lock() at _simple_lock+0x130
selwakeup() at selwakeup+0xc0
logwakeup() at logwakeup+0x40
vprintf() at vprintf+0xe0
lock_printf() at lock_printf+0xa0
simple_lock_only_held() at simple_lock_only_held+0x158
simple_lock_switchcheck() at simple_lock_switchcheck+0x20
mi_switch() at mi_switch+0xbc
ltsleep() at ltsleep+0x49c
_lockmgr() at _lockmgr+0x11d8
genfs_lock() at genfs_lock+0x30
vn_lock() at vn_lock+0xf8
vget() at vget+0x1dc
checkalias() at checkalias+0x168
ufs_vinit() at ufs_vinit+0x88
ffs_vget() at ffs_vget+0x380
ufs_lookup() at ufs_lookup+0xe58
lookup() at lookup+0x4a4
namei() at namei+0x49c
sys___lstat13() at sys___lstat13+0x5c
syscall_plain() at syscall_plain+0x158
XentSys() at XentSys+0x5c
--- syscall (280) ---
--- user mode ---
db{0}> ps
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
 23465          23462      22900          0 3 0x84084             sort  piperd
 23464          23462      22900          0 3 0x84084            xargs  piperd
>How-To-Repeat:
  This has only happened once so far.
>Fix:
	
>Release-Note:
>Audit-Trail:
>Unformatted:
 	
 	
 >23463          23462      22900          0 3 0x84004             find  vnlock
  23462          23284      22900          0 3 0x80084               sh    wait
  23284          22906      22900          0 3 0x84084               sh    wait
  23277            272        272         12 3 0x84184            showq  select
  22913          22911      22900          0 3 0x84184         postdrop   netio
  22911          22900      22900          0 3 0x84084         sendmail  piperd
  22909          22900      22900          0 3 0x84084              tee  piperd
  22906          22900      22900          0 3 0x84084               sh    wait
  22900          22898      22900          0 3 0x84084               sh    wait
  22898            284        284          0 3 0x80084             cron  piperd
  20452            272        272         12 3 0x84184           pickup  select
  15509            323        323        271 7  0x4007       setiathome
  14816            305        305        271 2 0x84007       setiathome
  26864          26863      26864        271 3 0x84086               sh   ttyin
  26863          26862      26862        271 3 0x80086           script   ttyin
  26862          26861      26862        271 3 0x84086           script   ttyin
  26861            647      26861        271 3 0x84086               sh    wait
  17071          17070      17071        271 3 0x84086               sh   ttyin
  17070            301      17070        271 3 0x84086               sh    wait
  11435            296      11435        271 3 0x84186     screen-3.9.5   pause
  9769             334       9769        271 3 0x84086               sh   ttyin
  9217           28406       9217        271 3 0x84086              ssh  select
  688            28406        688          0 4  0x5006              csh
  649              647        649        271 3 0x84086            xload  select
  647              646        647        271 3 0x84086               sh    wait
  646              279        279          0 3 0x84184          rlogind  select
  28406          28405      28406        271 3 0x84086               sh    wait
  28405            279        279          0 3 0x84184          rlogind  select
  19119            169        169      32767 3 0x80184            httpd   lockf
  19118            169        169      32767 3 0x80184            httpd   lockf
  19117            169        169      32767 3 0x80184            httpd   lockf
  19116            169        169      32767 3 0x80184            httpd   lockf
  19115            169        169      32767 3 0x80184            httpd  select
  335              334        335        271 3 0x84086            xload  select
  334              301        334        271 3 0x84086               sh    wait
  324              322        324        271 3 0x84086        mon_seti3 nanosle
  323              322        323        271 3 0x84086               sh    wait
  322              301        322        271 3 0x84086               sh    wait
  306              304        306        271 3 0x84086        mon_seti3 nanosle
  305              304        305        271 3 0x84086               sh    wait
  304              301        304        271 3 0x84086               sh    wait
  303              302        303        271 3 0x84086              top  select
  302              301        302        271 3 0x84086               sh    wait
  301                1        301        271 3 0x80184     screen-3.9.5  select
  296              295        296        271 3 0x84086               sh    wait
  295              293        293        271 3 0x80184             sshd  select
  293              201        293          0 3 0x80184             sshd   netio
  289              272        272         12 3 0x84184             qmgr  select
  287                1          1          0 3 0x84084            getty nanosle
  286                1        286          0 3 0x84086            getty   ttyin
  284                1        284          0 3 0x80084             cron nanosle
  279                1        279          0 3 0x80084            inetd  select
  272                1        272          0 3 0x8418c           master  select
  209                1        209          0 3 0x80084        snmptrapd  select
  208              169        169      32767 3 0x80184            httpd   lockf
  207              169        169      32767 3 0x80184            httpd   netio
  206              169        169      32767 3 0x80184            httpd   lockf
  205              169        169      32767 3 0x80184            httpd   netio
  204              169        169      32767 3 0x80184            httpd   lockf
  201                1        201          0 3 0x80084             sshd  select
  194                1        194          0 3 0x80084            rwhod   netio
  184                1        184          0 3 0x80084             ntpd   pause
  172                1        172          0 3 0x80084              lpd  select
  169                1        169          0 3 0x80084            httpd  select
  142                1        142          0 3 0x80084           mountd  select
  121                1        121          0 3 0x80084        mount_mfs  mfsidl
  112                1        112          0 3 0x80084          rpcbind  select
  98                 1         98          0 3 0x80084          syslogd  select
  7                  0          0          0 3 0xa0204         aiodoned aiodone
  6                  0          0          0 3 0xa0204          ioflush biowait
  5                  0          0          0 3 0x20204           reaper  reaper
  4                  0          0          0 3 0xa0204       pagedaemon pgdaemo
  3                  0          0          0 3 0xa0204        atapibus0  sccomp
  2                  0          0          0 3 0xa0204         scsibus0  sccomp
  1                  0          1          0 3 0x84084             init    wait
  0                 -1          0          0 3 0xa0204          swapper schedul
 db{0}> c
 alpha trace requires known PC =eject=
 S
 oCpPpUe d1 :i nf aptiadl  2k3e4r6n3e l( ftirnadp):
 t
  CcPpUu _1D e b u gtgrearp+ 0exn4t:r y   =   0 x 2r e(tm e m o r yz emraon,a(gream)e
 td bf{a0u}l>t )
 CPU 1    a0         = 0x1b336ce0
 CPU 1    a1         = 0x1
 CPU 1    a2         = 0xffffffffffffffff
 CPU 1    pc         = 0x1b336ce0
 CPU 1    ra         = 0x1b336ce3
 CPU 1    pv         = 0xfffffc00003c7080
 CPU 1    curproc    = 0xfffffc0028280bd0
 CPU 1        pid = 23463, comm = find
 
 Caught exception in ddb.