Subject: kern/20646: deadlock on vnlock (?)
To: gnats-bugs@gnats.netbsd.org
From: martin@duskware.de
List: netbsd-bugs
Date: 03/10/2003 12:33:19
>Number: 20646
>Category: kern
>Synopsis: deadlock on vnlock(?)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 10 03:34:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator: Martin Husemann
>Release: NetBSD 1.6.1_RC2
>Organization:
>Environment:
System: NetBSD burgvogt.aprisoft.de 1.6.1_RC2 NetBSD 1.6.1_RC2 (VOGT) #0: Wed Mar 5 11:49:47 CET 2003 martin@beasty.aprisoft.de:/usr/src-1-6/sys/arch/sparc/compile/VOGT sparc
Architecture: sparc
Machine: sparc
>Description:
From time to time a SPARCstation 2 used as a DSL router, running diskless from
an i386 NFS server built from the same sources, locks up. In this state the
kernel continues to route packets, but apparently the (NFS root) filesystem
is locked, so no new process can be forked.
Stopped at cpu_Debugger+0x4: jmpl [%o7 + 0x8], %g0
db> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
1061 176 1061 0 3 0x4 sshd vnlock
1060 192 192 0 3 0x100014 ifwatchd vnlock
272 1 272 0 3 0x104006 getty vnlock
240 1 240 0 3 0x4 ntpd nfsrcvl
200 1 200 0 3 0x4 cron netio
192 1 192 0 3 0x4 ifwatchd ppwait
189 1 189 0 3 0x84 inetd pause
176 1 176 0 3 0x4 sshd vnlock
78 1 78 0 3 0x84 syslogd select
11 0 0 0 3 0x20204 aiodoned aiodone
10 0 0 0 3 0x20204 ioflush syncer
9 0 0 0 3 0x20204 reaper reaper
8 0 0 0 3 0x20204 pagedaemon pgdaemo
7 0 0 0 3 0x20284 nfsio nfsidl
6 0 0 0 3 0x20284 nfsio nfsidl
5 0 0 0 3 0x20284 nfsio nfsidl
4 0 0 0 3 0x20284 nfsio nfsidl
3 0 0 0 3 0x20204 scsibus1 sccomp
2 0 0 0 3 0x20204 scsibus0 sccomp
1 0 1 0 3 0x4084 init wait
0 -1 0 0 3 0x20204 swapper schedul
db> ps/w
PID COMMAND EMUL PRI UTIME STIME WAIT-MSG WAIT-CHANNEL
1061 sshd netbsd 20 0.0 0.0 vnlock 0xf2f8106c
1060 ifwatchd netbsd 20 0.0 0.0 vnlock 0xf2f8106c
272 getty netbsd 20 0.0 0.1 vnlock 0xf2f8106c
240 ntpd netbsd 21 7.6 23.8 nfsrcvlk 0xf03675c0
200 cron netbsd 24 1.1 7.1 netio 0xf0368044
192 ifwatchd netbsd 32 0.0 0.0 ppwait 0xf30b33c0
189 inetd netbsd 40 0.0 0.0 pause 0xf2f99610
176 sshd netbsd 20 39.3 0.3 vnlock 0xf2f8106c
78 syslogd netbsd 24 1.0 0.6 select selwait
11 aiodoned netbsd 4 0.0 0.1 aiodoned uvm+0x34
10 ioflush netbsd 40 0.0 30.9 syncer rushjob
9 reaper netbsd 4 0.0 6.9 reaper deadproc
8 pagedaemon netbsd 4 0.0 0.0 pgdaemon uvm+0x28
7 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0xc
6 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0x8
5 nfsio netbsd 32 0.0 0.0 nfsidl nfs_iodwant+0x4
4 nfsio netbsd 32 0.0 0.3 nfsidl nfs_iodwant
3 scsibus1 netbsd 16 0.0 0.0 sccomp 0xf02f2500
2 scsibus0 netbsd 16 0.0 0.0 sccomp 0xf02f2900
1 init netbsd 32 0.0 0.2 wait 0xf276e000
0 swapper netbsd 4 0.0 1.2 scheduler proc0
db> ps/a:
1061 sshd 0xf2f88b18 0xf30e1000 0xf2f734b0
1060 ifwatchd 0xf2f88cf0 0xf30d6000 0xf2f73640
272 getty 0xf2f88590 0xf2f92000 0xf2f730c8
240 ntpd 0xf30b3b20 0xf30d8000 0xf2f737d0
200 cron 0xf30b3948 0xf30d1000 0xf2f73898
192 ifwatchd 0xf30b33c0 0xf30c4000 0xf2f73640
189 inetd 0xf30b31e8 0xf30bd000 0xf2f73190
176 sshd 0xf30b3010 0xf30b6000 0xf2f73320
78 syslogd 0xf2f88940 0xf3041000 0xf2f73258
11 aiodoned 0xf2f883b8 0xf2f8b000 0xf0195e18
10 ioflush 0xf2f881e0 0xf2f89000 0xf0195e18
9 reaper 0xf2f88008 0xf2f86000 0xf0195e18
8 pagedaemon 0xf276ece8 0xf2f84000 0xf0195e18
7 nfsio 0xf276eb10 0xf2f7f000 0xf0195e18
6 nfsio 0xf276e938 0xf2f7d000 0xf0195e18
5 nfsio 0xf276e760 0xf2f7b000 0xf0195e18
4 nfsio 0xf276e588 0xf2f79000 0xf0195e18
3 scsibus1 0xf276e3b0 0xf2f77000 0xf0195e18
2 scsibus0 0xf276e1d8 0xf2f75000 0xf0195e18
1 init 0xf276e000 0xf276c000 0xf2f73000
0 swapper 0xf0195ee8 0xf0170488 0xf0195e18
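In the ps/w listing all four vnlock sleepers share one wait channel
(0xf2f8106c), so they are presumably queued on the same lock, while ntpd
sleeps in nfsrcvlk. A minimal sketch for grouping such a listing by wait
channel is given here. It is an offline helper written for illustration, not
part of ddb; the name group_by_wchan is made up, and it assumes the eight
whitespace-separated columns shown in the ps/w header.

```python
from collections import defaultdict

# Subset of the "ps/w" lines from this report, copied verbatim.
PS_W = """\
1061 sshd netbsd 20 0.0 0.0 vnlock 0xf2f8106c
1060 ifwatchd netbsd 20 0.0 0.0 vnlock 0xf2f8106c
272 getty netbsd 20 0.0 0.1 vnlock 0xf2f8106c
240 ntpd netbsd 21 7.6 23.8 nfsrcvlk 0xf03675c0
200 cron netbsd 24 1.1 7.1 netio 0xf0368044
176 sshd netbsd 20 39.3 0.3 vnlock 0xf2f8106c
"""


def group_by_wchan(text):
    """Map (wait message, wait channel) -> list of (pid, command)."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a process line has exactly 8 columns and starts with a PID.
        if len(fields) != 8 or not fields[0].isdigit():
            continue  # skip headers, prompts and blank lines
        pid, command, _emul, _pri, _utime, _stime, wmesg, wchan = fields
        groups[(wmesg, wchan)].append((int(pid), command))
    return dict(groups)


groups = group_by_wchan(PS_W)
# Print the most contended wait channels first.
for (wmesg, wchan), procs in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(f"{wmesg:10} {wchan:12} {procs}")
```

Any channel with several sleepers is a candidate for the contended lock. The
listing alone does not show which process holds it.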
>How-To-Repeat:
Dunno. It happens regularly for me on this machine. The other machine (i386),
using the same source tree and facing higher network load, is completely
stable.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
>Unformatted: