Subject: kern/9502: Interesting LFS problem
To: None <gnats-bugs@gnats.netbsd.org>
From: Jason R Thorpe <thorpej@nas.nasa.gov>
List: netbsd-bugs
Date: 02/28/2000 14:12:41
>Number: 9502
>Category: kern
>Synopsis: Interesting LFS problem
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Feb 28 14:12:00 2000
>Last-Modified:
>Originator:
>Organization:
Numerical Aerospace Simulation Facility - NASA Ames
>Release: NetBSD 1.4T, Feb 28 2000
>Environment:
System: NetBSD bishop 1.4T NetBSD 1.4T (BISHOP) #1010: Thu Feb 24 16:24:46 PST 2000 thorpej@bishop:/amd/dracul/u2/netbsd/src/sys/arch/alpha/compile/BISHOP alpha
>Description:
I needed to scrub out my object tree and decided to try using
LFS for it again.
I created an LFS file system and made /usr/obj point to it. This
file system is NFS exported, and 5 or 6 other systems mount that
file system for their /usr/obj as well.
While the server was churning along on a "make build", one of
the clients (an AlphaStation 500) was also doing a "make build".
The client failed to finish building libc:
cc -O2 -DALL_STATE -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Werror -D_LIBC -DNLS -DYP -DHESIOD -DLIBC_SCCS -DSYSLIBC_SCCS -D_REENTRANT -I/amd/dracul/u2/netbsd/src/lib/libc/include -DINET6 -D__DBINTERFACE_PRIVATE -DRESOLVSORT -I. -DPOSIX_MISTAKE -DFLOATING_POINT -c /amd/dracul/u2/netbsd/src/lib/libc/net/res_query.c
ld: cannot open output file res_query.o: Input/output error
*** Error code 1
Upon further investigation:
bishop:thorpej 103$ sudo touch obj.alpha/res_query.o
Password:
touch: obj.alpha/res_query.o: Input/output error
bishop:thorpej 104$ sudo touch obj.alpha/res_query.oaa
bishop:thorpej 105$ sudo rm obj.alpha/res_query.oaa
bishop:thorpej 106$ ls obj.alpha/res_query.*
ls: obj.alpha/res_query.o: No such file or directory
16 obj.alpha/res_query.ln 10 obj.alpha/res_query.o.o
The same problem happens on the server. I'm guessing a directory
entry is trashed.
Note that this may not be specific to LFS via NFS, but it may
be that it was easier to tickle this problem using this access
method.
fsck_lfs says:
dracul:thorpej 79$ fsck_lfs -n /dev/rsd3a
** /dev/rsd3a (NO WRITE)
** Last Mounted on /u3
** Phase 0 - Check Segment Summaries
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
UNALLOCATED I=6643
INO is NULL
REMOVE? no
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
17011 files, 139303 used, 0 free
dracul:thorpej 80$
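One way to check the trashed-directory-entry guess against the
fsck_lfs output is to walk the directory and compare what readdir
returns with what lstat can resolve. The following is only a rough
sketch, not something that was run on the affected file system; the
function name find_bad_entries is made up for illustration, and it
assumes a Python interpreter is available on the machine. On a
directory like obj.alpha one would expect it to print res_query.o
along with the inode number stored in its directory entry, which
could then be compared with the I=6643 that fsck_lfs reports.

```python
import os
import sys


def find_bad_entries(directory):
    """Return (name, inode, errno) for every entry that readdir
    reports but lstat cannot resolve.

    find_bad_entries is a hypothetical helper, not part of any
    NetBSD tool.
    """
    bad = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                # lstat does not follow symlinks, so a dangling
                # symlink is not reported as a bad entry.
                os.lstat(entry.path)
            except OSError as err:
                # On Unix, entry.inode() is the inode number taken
                # from the directory entry itself, without a stat call.
                bad.append((entry.name, entry.inode(), err.errno))
    return bad


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, inode, err in find_bad_entries(target):
        print(f"{name}: inode {inode}: {os.strerror(err)}")
```

Run with the directory as the only argument, e.g. obj.alpha. A
healthy directory should produce no output.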
>How-To-Repeat:
Not sure... it "just happened". I'll try to reproduce it
after I re-newfs_lfs it. (I can't remove that blasted file.)
However, I'll keep this file system around in case Konrad
has anything he wants me to try :-)
>Fix:
Unknown.
>Audit-Trail:
>Unformatted: