NetBSD-Bugs archive


Re: kern/32318: NFS client or server hang



On Sat, 24 May 2008 12:42:36 +0200 Manuel Bouyer wrote:
> On Fri, Dec 16, 2005 at 07:00:01PM +0000, Manuel Bouyer wrote:
> > >Number:         32318
> > >Category:       kern
> > >Synopsis:       NFS client or server hang
> > >Confidential:   no
> > >Severity:       serious
> > >Priority:       medium
> > >Responsible:    kern-bug-people
> > >State:          open
> > >Class:          sw-bug
> > >Submitter-Id:   net
> > >Arrival-Date:   Fri Dec 16 19:00:01 +0000 2005
> > >Originator:     Manuel Bouyer
> > >Release:        NetBSD 3.0_RC3
> > >Organization:
> > >Environment:
> > System: NetBSD chassiron.antioche.eu.org 3.0_RC3 NetBSD 3.0_RC3 (CHASSIRON) #0: Sat Nov 26 15:11:16 CET 2005 bouyer%pop.lip6.fr@localhost:/local/pop1/bouyer/tmp/sparc/obj/local/pop1/bouyer/netbsd-3/src/sys/arch/sparc/compile/CHASSIRON sparc
> > Architecture: sparc
> > Machine: sparc
> > >Description:
> >     Setup: I get mail from various POP3 servers via fetchmail and
> >     deliver it to local folders (mbox format) via procmail; the
> >     folders are stored on an NFS server.
> >     fetchmail/procmail run on an x86 box (Celeron 500) running a
> >     months-old current:
> >     NetBSD rochebonne.antioche.eu.org 3.99.7 NetBSD 3.99.7 (ROCHEBONNE) #1: Tue Aug  9 23:54:57 CEST 2005 bouyer%pop.lip6.fr@localhost:/local/pop1/bouyer/tmp/i386/obj/local/pop1/bouyer/current/src/sys/arch/i386/compile/ROCHEBONNE i386
> >     The NFS server is a sparc IPX (40MHz sparcv7).
> > 
> >     Problem: from time to time, the processes accessing the files
> >     on the NFS server hang. This usually happens when the client makes
> >     2 concurrent accesses to the mailboxes (e.g. reading a
> >     mailbox with mutt while procmail tries to deliver a mail to this
> >     mailbox). I've also seen this before the 3.0 branch was cut, with
> >     the NFS server running 2.0 or 2.1. I never noticed this when the
> >     server was running 1.6.2 (it started happening when the server got
> >     upgraded).
> 
> I haven't seen this since I upgraded the server to 4.99.62 (from
> 4.99.20 or so), neither during normal usage nor when explicitly
> trying to reproduce it. It would still be nice to have it fixed in
> netbsd-3 and netbsd-4, but at least this seems to have gone away in
> current, and hopefully it won't show up again in netbsd-5 :)
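
For anyone who wants to exercise the access pattern from the original report (one reader scanning a mailbox while another party appends to it), here is a minimal sketch in Python. It is not the procedure used above: the names `deliver`, `scan`, `count_messages` and `run` are made up for illustration, it uses two threads in one process rather than mutt and procmail, and it does no mbox locking at all. On a healthy mount it should simply finish; a `False` return only means a worker was still blocked when the timeout expired.

```python
import os
import threading


def deliver(path, count):
    """Append `count` small mbox-style messages, one open/write/close each."""
    for i in range(count):
        with open(path, "a") as mbox:
            mbox.write(
                "From sender@example.invalid Thu Jan  1 00:00:00 1970\n"
                f"Subject: test {i}\n\nbody {i}\n\n"
            )
            mbox.flush()
            os.fsync(mbox.fileno())  # ask for the data to be written out


def scan(path, rounds):
    """Re-read the whole mailbox `rounds` times, as a mail reader might."""
    for _ in range(rounds):
        with open(path) as mbox:
            mbox.read()


def count_messages(path):
    """Count mbox message separators (lines starting with 'From ')."""
    with open(path) as mbox:
        return sum(1 for line in mbox if line.startswith("From "))


def run(path, count=50, rounds=50, timeout=60.0):
    """Run writer and reader concurrently; True if both finished in time."""
    open(path, "a").close()  # make sure the file exists before scanning
    workers = [
        threading.Thread(target=deliver, args=(path, count), daemon=True),
        threading.Thread(target=scan, args=(path, rounds), daemon=True),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join(timeout)
    return not any(worker.is_alive() for worker in workers)
```

Pointing `run()` at a scratch file on the NFS mount (the path is up to you) and watching whether it returns is the whole test; a process stuck in an uninterruptible sleep will not be rescued by the timeout, it will just be reported.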

I don't know whether this is related, but:

server$ uname -a
NetBSD nostromo.od5.lohika.com 4.99.63 NetBSD 4.99.63 (GENERIC) #0: Tue May 27 21:10:03 EEST 2008 mishka%nostromo.od5.lohika.com@localhost:/build/ab/obj-i386-20080527/sys/arch/i386/compile/GENERIC i386

client$ uname -a
NetBSD router3.od3.lohika.com 4.0_STABLE NetBSD 4.0_STABLE (ROUTER3) #0: Mon Feb 25 16:57:05 EET 2008 mishka%nostromo.od3.lohika.com@localhost:/build/kernels/ROUTER3 i386
client$ ps axlw | grep vnlock
32767   194   193    0  -2  5   88   788 vnlock   DN   ?      0:00.00 find -s / ( ! -fstype local -o -fstype cd9660 -o -fstype fdes
    0  1219 17394    0  -2  0   68   736 vnlock   D    ?      0:00.00 find / ( ! -fstype local -o -fstype rdonly -o -fstype fdesc -
    0  2386  2545    0  -2  0   68   736 vnlock   D    ?      0:00.00 find / ( ! -fstype local -o -fstype rdonly -o -fstype fdesc -
 1000  2771     1    0  -2  0  240   940 vnlock   Ds   ttyp0  0:00.01 -ksh
    0 28768     1    0  -2  0   52   592 vnlock   D    ttyp0- 0:00.00 ls /build
    0  2893     1    0  -2  0  252  1108 vnlock   D    ttyp1- 0:00.06 sh
    0  5868     1    0  -2  0   24   492 vnlock   D    ttyp2- 0:00.01 umount -f /build
    0 21977     1    0  -2  0   24   492 vnlock   D    ttyp2- 0:00.01 umount -f /build
client$ mount -vt nfs 
nostromo.od5.lohika.com:/build on /build type nfs (read-only, fsid: 0xb01/0x70b, reads: sync 0 async 0, writes: sync 0 async 0)
client$ grep /build /etc/fstab
nostromo.od5.lohika.com:/build /build nfs ro,noauto,-is

Sorry, I wasn't able to grab a dump :(
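
In case it helps others check for the same symptom, below is a rough Python sketch that picks out processes sleeping on a given wait channel from `ps axlw` output. It assumes the column order seen in the listing above (PID in the 2nd field, wait channel in the 9th, state in the 10th, command from the 13th on) and that no field is empty; `blocked_on` is a made-up helper name, not an existing tool. Unlike a plain `grep vnlock`, it matches only the wait-channel column, so the grep process itself is never listed.

```python
def blocked_on(ps_lines, wchan="vnlock"):
    """Return (pid, state, command) for each process waiting on `wchan`.

    Assumes whitespace-separated `ps axlw` lines laid out as in the
    transcript: UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND.
    """
    hits = []
    for line in ps_lines:
        fields = line.split(None, 12)  # keep the command as one field
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # header, wrapped continuation, or short line
        if fields[8] == wchan:
            hits.append((int(fields[1]), fields[9], fields[12].rstrip()))
    return hits
```

It can be fed any iterable of lines, for example `blocked_on(sys.stdin)` in a small script at the end of a `ps axlw |` pipeline.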

--
Mishka.

