I have a system which is pretty normal:

  i386 (not amd64) NetBSD 6 from the last month or two
  Intel DH67CL mobo, Core i5-2310 (4 cpu)
  8G RAM, of which 3569MB shows as available (since I haven't switched
    to amd64)
  DIAGNOSTIC, FFS_EI, -g, IPSEC/ESP/NAT, MROUTING (but not doing that)
  swwdog, ulpt commented out, using coda
  the system has not been running X lately
  OWC SSD for / (ffsv1, v2 superblock, no wapbl), /var, /usr (ffsv2 wapbl)
  Seagate 1 TBish for /u0, 1 FS, FFSv2, WAPBL

This has crashed a few times, almost always under heavy checkout/build
load. I am unclear on whether it's because the power supply is being
stressed (which it shouldn't be) or because of the filesystem issue I'm
writing about. When it crashes, it panics in ffs code and fails to dump.

I have NetBSD-current checked out (and -5 and -6 trees), and often do
release builds for multiple architectures. I did an update on a tree
which had previously updated ok, and got a complaint about not being
able to remove a directory (gdb6 in this case). I found a CVS directory
which seemed to have nothing in it, as in "ls -la" showed no files, not
even '.' and '..'. So to make progress I moved it to /u0/lost+found,
and then I was able to remove the directories for the now-gone gdb bits.

Then I went to /u0/lost+found to look at CVS. Now it had the usual
Entries/Root/Repository/Template, plus . and .., quite normal. I
removed the 4 files, and then rmdir'd CVS.

So I wonder what's going on. It seems like the read from disk of the
blocks for the directory somehow ended up with an in-core
representation of the directory as being empty, while the actual disk
contents were ok. Moving the directory shouldn't have changed its inode
number, so I wonder why the data got reread (perhaps due to maxvnodes
of 104488 and doing a cvs update on -current).

Might this be a locking bug surrounding vnode eviction? Hardware
flakiness? Using memory too close to the 4G limit, more than should be
used, due to something in the system messing with it?