Subject: (long) NFS misbehaving under -current?
To: None <current-users@netbsd.org>
From: Brian C. Grayson <bgrayson@marvin.ece.utexas.edu>
List: current-users
Date: 10/07/1998 23:56:51
<Sorry about the length. I'm just trying to provide as much
info as I can.>
Is anyone else experiencing bad behavior from NFS? I built a
new system (sup -o, rm -rf kernel dir, config and build
kernel, reboot, cleandir userland, build userland, reboot),
in hopes that it was due to stale .o files or something, but to
no avail.
The symptoms I've noticed:
1. If a directory is larger than 1024 bytes, a readdir(?) on
it might not return anything:
% wc -c lSLMP* <-- these are directories:
5120 lSLMP.1
1024 lSLMP.1.static
6144 lSLMP.16
...
6144 lSLMP.8
1024 lSLMP.static.1
5120 lSLMPtest.4
% printf "%s\n" */*
<only prints files in lSLMP.1.static/ and lSLMP.static.1/, both
of which are 1024 byte directories>
Doing an ls -la in lSLMP.8, for example, doesn't print
_anything_, not even . and ..!
Under 1.3C, 1.3E, 1.3F, 1.3G (IIRC) it works as expected
(all i386 boxes).
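For anyone who wants to check whether their own client shows the same symptom, here is a minimal sketch. It assumes Python is available on the client. The names check_directories and is_suspicious, and the 1024-byte threshold, are illustrative choices based on the sizes listed above, not a diagnosis of the bug.

```python
import os


def is_suspicious(dir_size, n_entries, threshold=1024):
    """True when a directory reports more than `threshold` bytes but lists nothing."""
    return dir_size > threshold and n_entries == 0


def check_directories(parent, threshold=1024):
    """Return (name, size, entry_count, suspicious) for each subdirectory of parent."""
    report = []
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        if not os.path.isdir(path):
            continue
        size = os.stat(path).st_size      # directory size as reported by stat(2)
        entries = os.listdir(path)        # never includes "." and ".."
        count = len(entries)
        report.append((name, size, count, is_suspicious(size, count, threshold)))
    return report
```

Calling check_directories(".") in the parent of the lSLMP* directories and printing the rows should flag any directory that is larger than the threshold yet lists no entries. A directory that really is empty but large would be flagged too, so treat a hit as a hint only.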
This filesystem happens to be mounted from a Linux box. From
tcpdump -vv -s 1000 during `ls' in lSLMP.8, the server sends
back one readdir reply, followed by an ERR reply, followed by two
more ERR replies:
23:47:49.591810 marvin.3234575964 > sim6.nfs: 128 getattr fh Unknown/1 (ttl 64, id 8816)
23:47:49.598264 sim6.nfs > marvin.1019: . ack 494307729 win 32120 (DF) (ttl 64, id 38167)
23:47:49.619802 sim6.nfs > marvin.3234575964: reply ok 100 getattr DIR 40755 ids 1084/1001 sz 6144 (DF) (ttl 64, id 38168)
23:47:49.620001 marvin.3234575965 > sim6.nfs: 96 fsstat fh Unknown/1 (ttl 64, id 8818)
23:47:49.629489 sim6.nfs > marvin.3234575965: reply ok 52 fsstat tsize 8192 bsize 512 blocks 7916998 bfree 983556 bavail 573956 (DF) (ttl 64, id 38171)
23:47:49.629752 marvin.3234575966 > sim6.nfs: 136 readdir fh Unknown/1 8192 bytes @ 0 (ttl 64, id 8819)
23:47:49.637243 sim6.nfs > marvin.3234575966: reply ok 1460 readdir offset 1 size 688326671 eof (DF) (ttl 64, id 38174)
23:47:49.638503 sim6.nfs > marvin.1869640494: reply ERR 1460 nop (DF) (ttl 64, id 38175)
23:47:49.638579 marvin.1019 > sim6.nfs: . ack 3072 win 14284 (ttl 64, id 8820)
23:47:49.639244 sim6.nfs > marvin.13: reply ERR 1080 nop (DF) (ttl 64, id 38176)
23:47:49.639340 marvin.1019 > sim6.nfs: . ack 4152 win 17200 (ttl 64, id 8821)
23:47:49.640411 sim6.nfs > marvin.7: reply ERR 1340 nop (DF) (ttl 64, id 38177)
23:47:49.691164 marvin.1019 > sim6.nfs: . ack 5492 win 15860 (ttl 64, id 8829)
23:47:54.121269 marvin.573243392 > sim6.nfs: 40 null (ttl 64, id 8831)
23:47:54.131618 sim6.nfs > marvin.573243392: reply ok 32 null (ttl 64, id 38805)
23:48:24.131246 marvin.1378549760 > sim6.nfs: 40 null (ttl 64, id 8883)
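The three ERR lines carry xids (1869640494, 13, 7) that do not match any request in the trace. One possible reading, and it is only a guess, is that this mount runs over TCP (note the bare acks to marvin.1019). In that case the ERR lines could be continuation segments of the large readdir reply that tcpdump decodes as if each began a new RPC header. A quick sanity check is to look at those xid values as raw bytes. The helper names below are made up for illustration.

```python
import struct


def xid_as_bytes(xid):
    """Pack a 32-bit RPC xid as the 4 big-endian bytes seen on the wire."""
    return struct.pack(">I", xid)


def looks_like_text(raw):
    """True if every byte is printable ASCII."""
    return all(0x20 <= b < 0x7F for b in raw)


# xids copied from the three "reply ERR" lines in the trace above
for xid in (1869640494, 13, 7):
    raw = xid_as_bytes(xid)
    kind = "printable ASCII" if looks_like_text(raw) else "not printable"
    print(xid, raw, kind)
```

The first value packs to the ASCII bytes "ops.", which would fit a fragment of a file name in the middle of a directory listing. The other two are small integers that could plausibly be length fields. None of this proves the guess, but it suggests the ERR lines may be a decoding artifact rather than real error replies from the server.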
2. chmod for the setuid bit doesn't propagate from the flaky
client to the server or other clients very quickly:
% ls -l foo
-rwxrwxrwx 1 bgrayson wheel 7 Sep 7 1997 foo*
% chmod 04777 foo
% ls -l foo
-rwsrwxrwx 1 bgrayson wheel 7 Sep 7 1997 foo*
% ssh r2d2 ls -l foo
-rwxrwxrwx 1 bgrayson wheel 7 Sep 7 1997 foo
% ssh r2d2 ls -l foo <-- after waiting a few more seconds:
-rwsrwxrwx 1 bgrayson wheel 7 Sep 7 1997 foo
This filesystem is served by a 1.3F NetBSD box. When the chmod
is run on any of the other client machines, it appears to take
effect instantly on all other machines.
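To put a number on "a few more seconds", one could poll the file's mode on the observing machine. This is a rough sketch, assuming Python on that machine. The function name wait_for_mode_bit and its default timeout and interval are hypothetical.

```python
import os
import stat
import time


def wait_for_mode_bit(path, bit=stat.S_ISUID, timeout=30.0, interval=0.5):
    """Poll stat(2) until `bit` is set in the file mode.

    Returns the seconds waited, or None if the bit never showed up
    within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        if os.stat(path).st_mode & bit:
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Start wait_for_mode_bit("foo") on r2d2 just before running the chmod on the flaky client, then repeat with the chmod issued from a well-behaved client. That gives two delays to compare, each accurate only to the polling interval.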
This is the first time in a long time that a buildable
-current system has been broken for me, so it seems highly
likely that I've goofed in some way. The last -current system we
had working was probably from July. Thanks in advance!
P.S. This is probably totally unrelated, but 12 files in
/usr/src/gnu/usr.bin/gawk had 512 bytes of 0xf6 in the middle of
them, on a 512-byte boundary. So far, no other files appear
affected. Does this look familiar to anyone? Is our disk going bad?
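To check whether other files are affected, the tree can be scanned for aligned 512-byte blocks made up entirely of 0xf6. The sketch below assumes exactly the pattern described above (a whole block of one byte value on a 512-byte boundary). The names find_filled_blocks and scan_tree are illustrative.

```python
import os


def find_filled_blocks(path, fill=0xF6, block_size=512):
    """Return offsets of aligned blocks that consist solely of `fill` bytes."""
    pattern = bytes([fill]) * block_size
    offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            if chunk == pattern:      # a short final chunk can never match
                offsets.append(offset)
            offset += len(chunk)
    return offsets


def scan_tree(root, fill=0xF6, block_size=512):
    """Map each regular file under root to its list of filled-block offsets."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue
            found = find_filled_blocks(path, fill, block_size)
            if found:
                hits[path] = found
    return hits
```

Running scan_tree("/usr/src") returns a dictionary of file paths and block offsets. It only catches runs that start on a block boundary, so a run shifted within the file would be missed.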
Here are my kernel configs (std.UT holds stuff for all
machines, and MARVIN holds the marvin-specific stuff):
------------------------ std.UT ----------------------------
#
# std.UT
#
include "arch/i386/conf/std.i386"
## bgrayson: doubling maxusers to 64, to allow more processes.
maxusers 64 # estimated number of users
options I586_CPU
options I686_CPU
options XSERVER,UCONSOLE
options INSECURE # insecure; allow /dev/mem writing for X
options RTC_OFFSET=0 # hardware clock is this many mins. west of GMT
options KTRACE
options SYSVMSG # System V-like message queues
options SYSVSEM # System V-like semaphores
options SYSVSHM # System V-like memory sharing
# bgrayson bumped SHMMAXPGS -- BSPlib seems to need more than 1024 when
# simulating more than 8 processors.
options SHMMAXPGS=4096 # 1024 pages is the default
options LKM # loadable kernel modules
options COMPAT_NOMID # compatibility with 386BSD, BSDI, NetBSD 0.8,
options COMPAT_09 # NetBSD 0.9,
options COMPAT_10 # NetBSD 1.0,
options COMPAT_11 # NetBSD 1.1,
options COMPAT_12 # NetBSD 1.2,
options COMPAT_13 # NetBSD 1.3,
options COMPAT_43 # and 4.3BSD
options COMPAT_386BSD_MBRPART # recognize old partition ID
options COMPAT_LINUX # binary compatibility with Linux
options LINUX_GCC_SIGNATURE ## (temporary?) modification
options COMPAT_FREEBSD # binary compatibility with FreeBSD
options EXEC_ELF32 # 32-bit ELF executables (SVR4, Linux)
file-system FFS # UFS
file-system EXT2FS # second extended file system (linux)
file-system MFS # memory file system
file-system NFS # Network File System client
file-system CD9660 # ISO 9660 + Rock Ridge file system
file-system MSDOSFS # MS-DOS file system
file-system FDESC # /dev/fd
file-system KERNFS # /kern
file-system NULLFS # loopback file system
file-system PORTAL # portal filesystem (still experimental)
file-system PROCFS # /proc
file-system UMAPFS # NULLFS + uid and gid remapping
file-system UNION # union file system
options NFSSERVER # Network File System server
options FIFO # FIFOs; RECOMMENDED
options INET # IP + ICMP + TCP + UDP
options NS # XNS
options ISO,TPIP # OSI
options EON # OSI tunneling over IP
options CCITT,LLC,HDLC # X.25
config netbsd root on ? type ?
options PCIVERBOSE # verbose PCI device messages
mainbus0 at root
pci* at mainbus? bus ?
pci* at pchb? bus ?
pci* at ppb? bus ?
pchb* at pci? dev ? function ? # PCI-Host bridges
pcib* at pci? dev ? function ? # PCI-ISA bridges
ppb* at pci? dev ? function ? # PCI-PCI bridges
isa* at mainbus? # all other ISA
isa* at pcib? # ISA on PCI-ISA bridge
pcppi0 at isa?
sysbeep0 at pcppi?
spkr0 at pcppi? # PC speaker
npx0 at isa? port 0xf0 irq 13 # math coprocessor
com0 at isa? port 0x3f8 irq 4 # standard PC serial ports
com1 at isa? port 0x2f8 irq 3
com2 at isa? port 0x3e8 irq 5
lpt0 at isa? port 0x378 irq 7 # standard PC parallel ports
fdc0 at isa? port 0x3f0 irq 6 drq 2 # standard PC floppy controllers
fd* at fdc? drive ?
## The ed driver no longer exists -- split into we and something else.
## We use the we version.
we0 at isa? port 0x280 iomem 0xd0000 irq 9 # WD/SMC Ethernet
pss0 at isa? port 0x220 irq 7 drq 6 # Personal Sound System
sp0 at pss0 port 0x530 irq 10 drq 0 # sound port driver
sb0 at isa? port 0x220 irq 7 drq 1 # SoundBlaster
#spkr0 at pckbcport? port 0x61 # Now at pcppi instead.
# PnP bus and devices should be declared last
isapnp0 at isa?
ep* at isapnp?
sb* at isapnp?
joy* at isapnp?
audio* at sb?
audio* at sp?
include "arch/i386/conf/GENERIC.local"
pseudo-device ccd 4 # concatenated disk devices
pseudo-device vnd 4 # paging to files
pseudo-device bpfilter 8 # packet filter
pseudo-device ipfilter # ip filter
pseudo-device loop 1 # network loopback
pseudo-device ppp 2 # PPP
pseudo-device sl 2 # CSLIP
pseudo-device tun 2 # network tunneling over tty
pseudo-device pty 64 # pseudo-terminals
pseudo-device tb 1 # tablet line discipline
-----------------------------------------------------------------
--------------------------- MARVIN ----------------------------
#
# MARVIN -- the guinea pig for the Chase group
#
include "arch/i386/conf/std.UT"
options DDB # in-kernel debugger
#makeoptions DEBUG="-g" # compile full symbol table
options DIAGNOSTIC # internal consistency checks
#options KTRACE # system call tracing, a la ktrace(1)
## The BIOSEXTMEM option is no longer needed, if the new boot blocks
## (from around Sep 97 or later) are on the boot disk. bgrayson
vt0 at isa? port 0x60 irq 1
wdc0 at isa? port 0x1f0 irq 14 # ST506, ESDI, and IDE controllers
wdc1 at isa? port 0x170 irq 15
wd* at wdc? drive ?
## ATAPI support
atapibus* at wdc?
cd* at atapibus? drive ?
#### For trying out the NCR SCSI card in marvin:
ncr* at pci? dev ? function ? # NCR 53c8xx SCSI
#### For trying out orac's old SCSI card in marvin:
ahc* at pci? dev ? function ? # Adaptec 294x, aic78x0 SCSI controllers
options SCSIVERBOSE
## Bus
scsibus* at ncr?
scsibus* at ahc?
## Devices...
sd* at scsibus? target ? lun ? # SCSI disk drives
st* at scsibus? target ? lun ? # SCSI tape drives
cd* at scsibus? target ? lun ? # SCSI CD-ROM drives
ch* at scsibus? target ? lun ? # SCSI autochangers
ss* at scsibus? target ? lun ? # SCSI scanners
uk* at scsibus? target ? lun ? # SCSI unknown
-------------------------------------------------------------