Subject: kern/27085: 2.0 RC1 hangs in vnlock
To: None <gnats-bugs@gnats.NetBSD.org>
From: Martin Husemann <martin@aprisoft.de>
List: netbsd-bugs
Date: 09/30/2004 09:04:53
>Number:         27085
>Category:       kern
>Synopsis:       2.0 RC1 hangs in vnlock
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 30 07:06:01 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Martin Husemann
>Release:        NetBSD 2.0_RC1
>Organization:
>Environment:
System: NetBSD burgvogt.aprisoft.de 2.0_RC1 NetBSD 2.0_RC1 (VOGT) #1: Mon Sep 27 09:02:52 CEST 2004  martin@emmas.aprisoft.de:/usr/src/sys/arch/sparc/compile/VOGT sparc
Architecture: sparc
Machine: sparc
>Description:

I upgraded my diskless sparc (SS2, sun4c) to 2.0_RC1 a few days ago. The
machine has a small but constant network load and hardly ever writes to
its NFS root: logging goes to a remote host, and most accesses are read-only.

The previous 2.0_BETA kernel from Aug 31 worked without a problem, with a
single reboot in between. The new kernel hangs roughly once a day:
processes trying to access the NFS root block in vnlock.

The machine is alive and even routes packets, but any access to the file
system blocks the calling process:

db> bt
cpu_Debugger(0xf03d6e70, 0x0, 0xf01dc1f8, 0x82c5c6, 0x100, 0xf01de1f8) at netbsd:zsc_intr_hard+0xf0
zsc_intr_hard(0x8, 0x7ffffc00, 0x2413820b, 0x3a1f, 0xffff, 0xf0002000) at netbsd:zshard+0x44
zshard(0x0, 0xf0149f7c, 0xd00, 0x118000e7, 0x82c598, 0xf01de1f8) at netbsd:sparc_interrupt44c+0x118
sparc_interrupt44c(0x1, 0x28, 0x0, 0x0, 0x0, 0x0) at netbsd:switchexit+0xd8
db> ps/w
 PID          COMMAND     EMUL  PRI UTIME STIME WAIT-MSG    WAIT-CHANNEL
 1020            sshd   netbsd   20   0.0   0.0 vnlock      netbsd:ddb_cpuinfo+0x26948f4
 1109              sh   netbsd   20   0.0   0.0 vnlock      netbsd:ddb_cpuinfo+0x26abd94
 1453              sh   netbsd   32   0.0   0.1 ppwait      netbsd:ddb_cpuinfo+0x269e37c
 1076            cron   netbsd   24   0.0   0.0 piperd      netbsd:ddb_cpuinfo+0x26b2eb4
 1531            tcsh   netbsd   20   0.0   0.1 vnlock      netbsd:ddb_cpuinfo+0x26e9c54
 1168            tcsh   netbsd   17   0.0   0.1 getblk      netbsd:ddb_cpuinfo+0x27e60c
 562            getty   netbsd   20   0.0   0.1 vnlock      netbsd:ddb_cpuinfo+0x26948f4
 573             cron   netbsd   20   0.5   5.0 vnlock      netbsd:ddb_cpuinfo+0x26948f4
 527            inetd   netbsd   24   0.0   0.0 kqread      netbsd:ddb_cpuinfo+0x29b884
 446             sshd   netbsd   20   0.0   0.1 vnlock      netbsd:ddb_cpuinfo+0x26948f4
 438           upsmon   netbsd   32  17.9  62.7 nanosleep   netbsd:nanowait.0
 462           upsmon   netbsd   24   0.0   0.0 piperd      netbsd:ddb_cpuinfo+0x26b2884
 401             ntpd   netbsd   20   6.1  16.8 vnlock      netbsd:ddb_cpuinfo+0x26948f4
 152         ifwatchd   netbsd   24   0.0   0.0 netio       netbsd:ddb_cpuinfo+0x228c2c
 147          syslogd   netbsd   24   1.2   0.7 poll        netbsd:selwait
 11          aiodoned   netbsd    4   0.0   0.0 aiodoned    netbsd:uvm+0x34
 10           ioflush   netbsd   40   0.0  10.5 syncer      netbsd:rushjob
 9         pagedaemon   netbsd    4   0.0   0.0 pgdaemon    netbsd:uvm+0x28
 8              nfsio   netbsd   32   0.0   0.2 nfsidl      netbsd:nfs_asyncdaemon+0x38
 7              nfsio   netbsd   32   0.0   0.3 nfsidl      netbsd:nfs_asyncdaemon+0x28
 6              nfsio   netbsd   32   0.0   0.5 nfsidl      netbsd:nfs_asyncdaemon+0x18
 5              nfsio   netbsd   32   0.0   1.7 nfsidl      netbsd:nfs_asyncdaemon+0x8
 4              nell0   netbsd   32   0.0   0.0 pcicev      netbsd:ddb_cpuinfo+0x1baed0
 3           scsibus1   netbsd   16   0.0   0.0 sccomp      netbsd:ddb_cpuinfo+0x1c2f8c
 2           scsibus0   netbsd   16   0.0   0.0 sccomp      netbsd:ddb_cpuinfo+0x1c338c
 1               init   netbsd   32   0.0   0.2 wait        netbsd:ddb_cpuinfo+0x1e7b884
 0            swapper   netbsd    4   0.0   0.3 scheduler   netbsd:proc0


>How-To-Repeat:
Unknown; the hang occurs roughly once a day on this machine.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
>Unformatted: