Subject: 1.5 current NFS client problem?
To: 386 Port NetBSD <port-i386@netbsd.org>
From: Scott Presnell <srp@zgi.com>
List: current-users
Date: 06/16/2001 18:01:47
Hi Folks,
	I'm having some trouble with some recently updated NetBSD machines
and Netscape IMAP based Mail.  I've taken it about as far as I can, and now I
need some advice on how to troubleshoot this further, maybe in preparation
for a pr.

I read mail via IMAP servers on three different hosts from within
Netscape.  My home directory (which contains the .netscape data directory)
is served from a i386 NetBSD machine which was running 1.4.2. Call this
host A, the NFS server.

On an i386 machine running stock 1.4.2 using Netscape (either BSDI or Linux
versions), and which mounts my home from host A, I have no problem tracking
the different mailboxes.  Call this host B.

I noticed that when I tried to read mail on my laptop running 1.5T, using
my home from host A, I got corruption and/or some other kind of problem with
the inbox.summary files that netscape uses to track table of contents.
Eventually Netscape dumps, and I loose the summary file.  Call my laptop
host C.  I blew this off for a while, since I didn't use my laptop
connected to this network that often.

Yesterday I updated hosts A, and B, both the server and my usual client to
NetBSD 1.5V.  Now I see this inbox summary file corruption all the time on
host B as well.  It doesn't take long to happen, a couple checks of the
separate inboxes, and "poof".

I noticed that if I used a home directory local to my laptop, netscape
behaved much more reasonably.  Using that idea, on newly updated host B, I
set links to have netscape store the summary files on a local volume.  When
I do this, I can bang on netscape Mail until the cows come home and it
remains stable (again either BSDI or Linux versions of Netscape 4.76 or
4.77).... NFS client problem?

In experiments, I adjusted amd to use nfs version 2 and tried udp
transport to serve my home directory, but there was no change in the
behavior.

I grabbed a tcpdump about the time things go sour (I click on a new
inbox and I don't get a response, no summary).  I don't read NFS very well,
but near the bottom I can see that .Inbox.summary appears to be deleted but
is still open (thus renamed to a .nfs file) and that's the last stable
state of Netscape: any attempt to read from that inbox will crash the
process.  .58 is the client, .61 is the server.

This is rather irritating... 

     For starters, has anyone else experienced trouble like this?

     Can anyone help me interpret the communications: are the
     ERRS below significant?

     Where should I go next to troubleshoot this?

     Thanks for any help.

     - Scott



16:21:38.209439 xxx.61.2049 > xxx.58.3197248628: reply ok 116 getattr DIR 755 ids 1075/100 sz 512 
16:21:38.209666 xxx.58.3197248629 > xxx.61.2049: 116 access fh 0,6/81700 0002
16:21:38.209964 xxx.61.2049 > xxx.58.3197248629: reply ok 124 access c 0002
16:21:38.210093 xxx.58.3197248630 > xxx.61.2049: 112 getattr fh 0,6/81712
16:21:38.210376 xxx.61.2049 > xxx.58.3197248630: reply ok 116 getattr REG 644 ids 1075/100 sz 163840 
16:21:38.210516 xxx.58.3197248631 > xxx.61.2049: 132 lookup fh 0,6/81700 ".Inbox.summary"
16:21:38.210899 xxx.61.2049 > xxx.58.3197248631: reply ok 240 lookup fh 0,6/81712
16:21:38.211430 xxx.58.3197248632 > xxx.61.2049: 116 access fh 0,6/81712 000d
16:21:38.211721 xxx.61.2049 > xxx.58.3197248632: reply ok 124 access c 000d
16:21:38.212156 xxx.58.3197248633 > xxx.61.2049: 112 getattr fh 0,6/81712
16:21:38.212434 xxx.61.2049 > xxx.58.3197248633: reply ok 116 getattr REG 644 ids 1075/100 sz 163840 
16:21:38.212907 xxx.58.3197248634 > xxx.61.2049: 124 read fh 0,6/81712 32768 bytes @ 0
16:21:38.213023 xxx.58.3197248635 > xxx.61.2049: 124 read fh 0,6/81712 32768 bytes @ 32768
16:21:38.213218 xxx.61.2049 > xxx.58.1023: . ack 454000 win 65535 <nop,nop,timestamp 389 389>
16:21:38.214168 xxx.61.2049 > xxx.58.3197248634: reply ok 1448 read
16:21:38.214293 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.214357 xxx.58.1023 > xxx.61.2049: . ack 695761 win 63460 <nop,nop,timestamp 389 389>
16:21:38.214415 xxx.61.2049 > xxx.58.5240: reply ERR 1448
16:21:38.214539 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.214598 xxx.58.1023 > xxx.61.2049: . ack 698657 win 60564 <nop,nop,timestamp 389 389>
16:21:38.214661 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.214784 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.214841 xxx.58.1023 > xxx.61.2049: . ack 701553 win 57668 <nop,nop,timestamp 389 389>
16:21:38.214908 xxx.61.2049 > xxx.58.563085312: reply ok 1448
16:21:38.215030 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.215090 xxx.58.1023 > xxx.61.2049: . ack 704449 win 54772 <nop,nop,timestamp 389 389>
16:21:38.215153 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.215286 xxx.61.2049 > xxx.58.0: reply ERR 1448 null
16:21:38.215343 xxx.58.1023 > xxx.61.2049: . ack 707345 win 51876 <nop,nop,timestamp 389 389>
16:21:38.215400 xxx.61.2049 > xxx.58.3782545016: reply ERR 1448
16:21:38.215524 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.215581 xxx.58.1023 > xxx.61.2049: . ack 710241 win 48980 <nop,nop,timestamp 389 389>
16:21:38.215645 xxx.61.2049 > xxx.58.65536: reply ERR 1448
16:21:38.215771 xxx.61.2049 > xxx.58.3221356544: reply ERR 1448
16:21:38.215833 xxx.58.1023 > xxx.61.2049: . ack 713137 win 46084 <nop,nop,timestamp 389 389>
16:21:38.215896 xxx.61.2049 > xxx.58.3221356544: reply ERR 1448
16:21:38.216022 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.216083 xxx.58.1023 > xxx.61.2049: . ack 716033 win 43188 <nop,nop,timestamp 389 389>
16:21:38.216138 xxx.61.2049 > xxx.58.3296854016: reply ERR 1448
16:21:38.216265 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.216353 xxx.58.1023 > xxx.61.2049: . ack 718929 win 40292 <nop,nop,timestamp 389 389>
16:21:38.216394 xxx.61.2049 > xxx.58.108688: reply ERR 1448
16:21:38.216514 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.216576 xxx.58.1023 > xxx.61.2049: . ack 721825 win 37396 <nop,nop,timestamp 389 389>
16:21:38.216624 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.216759 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.216823 xxx.58.1023 > xxx.61.2049: . ack 724721 win 34500 <nop,nop,timestamp 389 389>
16:21:38.216854 xxx.61.2049 > xxx.58.0: reply ok 1044 null
16:21:38.216944 xxx.58.1023 > xxx.61.2049: . ack 725765 win 65535 <nop,nop,timestamp 389 389>
16:21:38.216968 xxx.61.2049 > xxx.58.3197248635: reply ok 1448 read
16:21:38.217092 xxx.61.2049 > xxx.58.99040: reply ERR 1448
16:21:38.217156 xxx.58.1023 > xxx.61.2049: . ack 728661 win 63456 <nop,nop,timestamp 389 389>
16:21:38.217215 xxx.61.2049 > xxx.58.0: reply ERR 1448 null
16:21:38.217341 xxx.61.2049 > xxx.58.2319253504: reply ok 1448
16:21:38.217397 xxx.58.1023 > xxx.61.2049: . ack 731557 win 60560 <nop,nop,timestamp 389 389>
16:21:38.217463 xxx.61.2049 > xxx.58.3198649860: reply ERR 1448
16:21:38.217585 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.217645 xxx.58.1023 > xxx.61.2049: . ack 734453 win 57664 <nop,nop,timestamp 389 389>
16:21:38.217707 xxx.61.2049 > xxx.58.84048: reply ERR 1448
16:21:38.217830 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.217887 xxx.58.1023 > xxx.61.2049: . ack 737349 win 54768 <nop,nop,timestamp 389 389>
16:21:38.217955 xxx.61.2049 > xxx.58.2899312640: reply ERR 1448
16:21:38.218083 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.218148 xxx.58.1023 > xxx.61.2049: . ack 740245 win 51872 <nop,nop,timestamp 389 389>
16:21:38.218191 xxx.61.2049 > xxx.58.938: reply ok 1448
16:21:38.218321 xxx.61.2049 > xxx.58.3184001024: reply ERR 1448
16:21:38.218385 xxx.58.1023 > xxx.61.2049: . ack 743141 win 48976 <nop,nop,timestamp 389 389>
16:21:38.218445 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.218572 xxx.61.2049 > xxx.58.1852140652: reply ERR 1448
16:21:38.218638 xxx.58.1023 > xxx.61.2049: . ack 746037 win 46080 <nop,nop,timestamp 389 389>
16:21:38.218689 xxx.61.2049 > xxx.58.1663989099: reply ERR 1448
16:21:38.218813 xxx.61.2049 > xxx.58.1495282224: reply ERR 1448
16:21:38.218875 xxx.58.1023 > xxx.61.2049: . ack 748933 win 43184 <nop,nop,timestamp 389 389>
16:21:38.218941 xxx.61.2049 > xxx.58.1415931245: reply ERR 1448
16:21:38.219065 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.219139 xxx.58.1023 > xxx.61.2049: . ack 751829 win 40292 <nop,nop,timestamp 389 389>
16:21:38.219178 xxx.61.2049 > xxx.58.4369: reply ERR 1448
16:21:38.219308 xxx.61.2049 > xxx.58.822083584: reply ERR 1448
16:21:38.219376 xxx.58.1023 > xxx.61.2049: . ack 754725 win 37396 <nop,nop,timestamp 389 389>
16:21:38.219438 xxx.61.2049 > xxx.58.65536: reply ERR 1448
16:21:38.219566 xxx.61.2049 > xxx.58.3292659712: reply ERR 1448
16:21:38.219638 xxx.58.1023 > xxx.61.2049: . ack 757621 win 34500 <nop,nop,timestamp 389 389>
16:21:38.219659 xxx.61.2049 > xxx.58.0: reply ERR 1044 null
16:21:38.219746 xxx.58.1023 > xxx.61.2049: . ack 758665 win 65535 <nop,nop,timestamp 389 389>
16:21:38.221986 xxx.58.3197248636 > xxx.61.2049: 148 setattr fh 0,6/81712
16:21:38.222400 xxx.61.2049 > xxx.58.3197248636: reply ok 148 setattr [|nfs]
16:21:38.223056 xxx.58.3197248637 > xxx.61.2049: 124 read fh 0,6/81712 32768 bytes @ 0
16:21:38.224042 xxx.61.2049 > xxx.58.3197248637: reply ok 1448 read
16:21:38.224161 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.224241 xxx.58.1023 > xxx.61.2049: . ack 761709 win 63460 <nop,nop,timestamp 389 389>
16:21:38.224283 xxx.61.2049 > xxx.58.5240: reply ERR 1448
16:21:38.224407 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.224465 xxx.58.1023 > xxx.61.2049: . ack 764605 win 60564 <nop,nop,timestamp 389 389>
16:21:38.224527 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.224651 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.224713 xxx.58.1023 > xxx.61.2049: . ack 767501 win 57668 <nop,nop,timestamp 389 389>
16:21:38.224772 xxx.61.2049 > xxx.58.563085312: reply ok 1448
16:21:38.224899 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.224957 xxx.58.1023 > xxx.61.2049: . ack 770397 win 54772 <nop,nop,timestamp 389 389>
16:21:38.225018 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.225143 xxx.61.2049 > xxx.58.0: reply ERR 1448 null
16:21:38.225203 xxx.58.1023 > xxx.61.2049: . ack 773293 win 51876 <nop,nop,timestamp 389 389>
16:21:38.225267 xxx.61.2049 > xxx.58.3782545016: reply ERR 1448
16:21:38.225390 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.225452 xxx.58.1023 > xxx.61.2049: . ack 776189 win 48980 <nop,nop,timestamp 389 389>
16:21:38.225513 xxx.61.2049 > xxx.58.65536: reply ERR 1448
16:21:38.225638 xxx.61.2049 > xxx.58.3221356544: reply ERR 1448
16:21:38.225698 xxx.58.1023 > xxx.61.2049: . ack 779085 win 46084 <nop,nop,timestamp 389 389>
16:21:38.225758 xxx.61.2049 > xxx.58.3221356544: reply ERR 1448
16:21:38.225889 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.225951 xxx.58.1023 > xxx.61.2049: . ack 781981 win 43188 <nop,nop,timestamp 389 389>
16:21:38.226004 xxx.61.2049 > xxx.58.3296854016: reply ERR 1448
16:21:38.226135 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.226201 xxx.58.1023 > xxx.61.2049: . ack 784877 win 40292 <nop,nop,timestamp 389 389>
16:21:38.226244 xxx.61.2049 > xxx.58.108688: reply ERR 1448
16:21:38.226377 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.226442 xxx.58.1023 > xxx.61.2049: . ack 787773 win 37396 <nop,nop,timestamp 389 389>
16:21:38.226495 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.226636 xxx.61.2049 > xxx.58.0: reply ok 1448 null
16:21:38.226709 xxx.61.2049 > xxx.58.0: reply ok 1044 null
16:21:38.226731 xxx.58.1023 > xxx.61.2049: . ack 790669 win 34500 <nop,nop,timestamp 389 389>
16:21:38.226945 xxx.58.1023 > xxx.61.2049: . ack 791713 win 65535 <nop,nop,timestamp 389 389>
16:21:38.227703 xxx.58.3197248638 > xxx.61.2049: 1448 write fh 0,6/81712 4096 bytes @ 0
16:21:38.227744 xxx.58.0 > xxx.61.2049: 1448 null
16:21:38.227838 xxx.58.5240 > xxx.61.2049: 1332 proc-14369918
16:21:38.228330 xxx.61.2049 > xxx.58.1023: . ack 457168 win 65535 <nop,nop,timestamp 389 389>
16:21:38.228404 xxx.61.2049 > xxx.58.1023: . ack 458500 win 65535 <nop,nop,timestamp 389 389>
16:21:38.228785 xxx.61.2049 > xxx.58.3197248638: reply ok 164 write [|nfs]
16:21:38.228973 xxx.58.3197248639 > xxx.61.2049: 124 commit fh 0,6/81712 4096 bytes @ 0
16:21:38.231719 xxx.61.2049 > xxx.58.3197248639: reply ok 156 commit [|nfs]
16:21:38.232294 xxx.58.3197248640 > xxx.61.2049: 132 lookup fh 0,6/81700 ".Inbox.summary"
16:21:38.232682 xxx.61.2049 > xxx.58.3197248640: reply ok 240 lookup fh 0,6/81712
16:21:38.232866 xxx.58.3197248641 > xxx.61.2049: 128 lookup fh 0,6/81700 ".nfsA012a4.4"
16:21:38.233230 xxx.61.2049 > xxx.58.3197248641: reply ok 120 lookup ERROR: No such file or directory
16:21:38.233365 xxx.58.3197248642 > xxx.61.2049: 180 rename fh 0,6/81700 ".Inbox.summary" -> fh 0,6/81700 ".nfsA012a4.4"
16:21:38.235269 xxx.61.2049 > xxx.58.3197248642: reply ok 264 rename
16:21:38.235410 xxx.58.3197248643 > xxx.61.2049: 128 lookup fh 0,6/81700 ".nfsA012a4.4"
16:21:38.235787 xxx.61.2049 > xxx.58.3197248643: reply ok 240 lookup fh 0,6/81712
16:21:38.336403 xxx.58.1023 > xxx.61.2049: . ack 792897 win 65535 <nop,nop,timestamp 389 389>