current-users: Re: xclock stuck in genput over NFS

Subject: Re: xclock stuck in genput over NFS
To: Steven M. Bellovin <smb@research.att.com>
From: Gary Duzan <gary@duzan.org>
List: current-users
Date: 10/04/2003 18:19:20
In Message <20031004172309.666907B43@berkshire.research.att.com> ,
   "Steven M. Bellovin" <smb@research.att.com> wrote:

=>In message <200310041629.h94GTu621518@wheel.duzan.org>, Gary Duzan writes:
=>>In Message <200310041624.h94GOrbk000330@capo.xnet.duzan.org> ,
=>>   Gary Duzan <gary@duzan.org> wrote:
=>>
=>>=>   With a kernel from this morning, every time I try to start up
=>>=>xclock it gets stuck in genput (as seen from "ps axl"), and afterwards
=>>=>any other access to the NFS mount which holds my home directory
=>>=>gets stuck in vnlock. If I don't start xclock, things seem ok. I tried
=>>=>attaching to xclock with gdb but didn't get a valid stack. Recompiling
=>>=>xclock didn't help either.
=>>
=>>   Quick update: xv hangs, too.
=>
=>I'm running a kernel from last night's source code, and I'm not seeing 
=>any such problems.  Not that I've tested exhaustively, but both xclock 
=>and xv work just fine. 

   Is your home directory on NFS? How about your userland? Along with
the new kernel, I just updated to a current userland (from pre-gcc3).

   I got a kernel stack trace of the stuck xv process from ddb:

===========================================================================
trace: pid 600 at 0xe42e58fc
ltsleep(c0a33e20,204,c0341d87,0,e46ac238) at netbsd:ltsleep+0x3f3
genfs_putpages(e42e5b44,c0364c20,2d5,c120e800,0) at netbsd:genfs_putpages+0x6b4
vinvalbuf(e46ac238,1,c127a100,e42bc404,0) at netbsd:vinvalbuf+0x63
nfs_vinvalbuf(e46ac238,1,c127a100,e42bc404,1) at netbsd:nfs_vinvalbuf+0xe4
nfs_open(e42e5cf4,fffffffe,c0362b40,400006,c03bab20) at netbsd:nfs_open+0x164
vn_open(e42e5eb4,1,180,fffffff9,c038a690) at netbsd:vn_open+0x350
sys_open(e42e7448,e42e5f64,e42e5f5c,0,c0fd5800) at netbsd:sys_open+0xc5
syscall_plain(e42e5fa8,1f,1f,1f,1f) at netbsd:syscall_plain+0x74
===========================================================================

   I also managed to hang mhlist, which produced this trace:

===========================================================================
trace: pid 427 at 0xe4377bdc
ltsleep(c0960748,204,c0341d87,0,e43b73d0) at netbsd:ltsleep+0x3f3
genfs_putpages(e4377de4,c035f060,c05,20002,c031cee0) at netbsd:genfs_putpages+0x6b4
nfs_flush(e43b73d0,c123a780,1,e4286dc4,0) at netbsd:nfs_flush+0x5f
nfs_close(e4377e64,20002,3f7f3ba8,393ae4e8,c031c460) at netbsd:nfs_close+0x60
vn_close(e43b73d0,3,c123a780,e4286dc4,e42d2918) at netbsd:vn_close+0x4e
vn_closefile(e42d2918,e4286dc4,499,c0385a48,b01) at netbsd:vn_closefile+0x1a
closef(e42d2918,e4286dc4,e4377f5c,0,0) at netbsd:closef+0x106
syscall_plain(e4377fa8,1f,1f,1f,1f) at netbsd:syscall_plain+0x74
===========================================================================
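
   Both processes are asleep at the same spot, genfs_putpages+0x6b4,
reached once from the open path and once from the close path. To
compare traces like these mechanically, here is a minimal sketch in
Python. It assumes each frame has the "name(args) at netbsd:name+0xoff"
form shown above; the helper names parse_trace and common_frames are
made up for illustration and are not part of any NetBSD tool.

```python
import re

# One ddb frame, e.g. "ltsleep(c0a33e20,204,...) at netbsd:ltsleep+0x3f3".
# Assumption: every frame line follows this layout, as in the traces above.
FRAME_RE = re.compile(r"^(\w+)\(([^)]*)\) at netbsd:(\w+)\+0x([0-9a-f]+)$")

XV_TRACE = """\
trace: pid 600 at 0xe42e58fc
ltsleep(c0a33e20,204,c0341d87,0,e46ac238) at netbsd:ltsleep+0x3f3
genfs_putpages(e42e5b44,c0364c20,2d5,c120e800,0) at netbsd:genfs_putpages+0x6b4
vinvalbuf(e46ac238,1,c127a100,e42bc404,0) at netbsd:vinvalbuf+0x63
nfs_vinvalbuf(e46ac238,1,c127a100,e42bc404,1) at netbsd:nfs_vinvalbuf+0xe4
nfs_open(e42e5cf4,fffffffe,c0362b40,400006,c03bab20) at netbsd:nfs_open+0x164
vn_open(e42e5eb4,1,180,fffffff9,c038a690) at netbsd:vn_open+0x350
sys_open(e42e7448,e42e5f64,e42e5f5c,0,c0fd5800) at netbsd:sys_open+0xc5
syscall_plain(e42e5fa8,1f,1f,1f,1f) at netbsd:syscall_plain+0x74
"""

MHLIST_TRACE = """\
trace: pid 427 at 0xe4377bdc
ltsleep(c0960748,204,c0341d87,0,e43b73d0) at netbsd:ltsleep+0x3f3
genfs_putpages(e4377de4,c035f060,c05,20002,c031cee0) at netbsd:genfs_putpages+0x6b4
nfs_flush(e43b73d0,c123a780,1,e4286dc4,0) at netbsd:nfs_flush+0x5f
nfs_close(e4377e64,20002,3f7f3ba8,393ae4e8,c031c460) at netbsd:nfs_close+0x60
vn_close(e43b73d0,3,c123a780,e4286dc4,e42d2918) at netbsd:vn_close+0x4e
vn_closefile(e42d2918,e4286dc4,499,c0385a48,b01) at netbsd:vn_closefile+0x1a
closef(e42d2918,e4286dc4,e4377f5c,0,0) at netbsd:closef+0x106
syscall_plain(e4377fa8,1f,1f,1f,1f) at netbsd:syscall_plain+0x74
"""


def parse_trace(text):
    """Return (function, offset) pairs, innermost frame first.

    Lines that are not frames (such as the "trace: pid ..." header)
    are skipped.
    """
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group(3), int(m.group(4), 16)))
    return frames


def common_frames(a, b):
    """Frames with the same function and offset in both traces, in a's order."""
    in_b = set(b)
    return [frame for frame in a if frame in in_b]


if __name__ == "__main__":
    shared = common_frames(parse_trace(XV_TRACE), parse_trace(MHLIST_TRACE))
    for name, offset in shared:
        print(f"{name}+0x{offset:x}")
```

   Apart from the syscall entry, the only frames the two traces share
are the ltsleep call and its caller in genfs_putpages.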

   Rebooting the NFS server (NetBSD 1.6.x) doesn't help, either.

					Gary Duzan