Subject: Wow, I really hosed my filesystem
To: None <current-users@netbsd.org>
From: Dave Huang <khym@azeotrope.org>
List: current-users
Date: 07/04/2002 15:16:27
This is on i386, 1.6_BETA4 kernel from yesterday: I was running build.sh
to update my userland stuff, and it had finished compiling everything
and had gotten to doing the install in /bin. I have softdeps enabled on
/, and am low on disk space on that partition, so I usually get that
annoying file system full error when it does the install in /bin and /sbin.
So, I did a "sync", since it seems like that helps a bit (although it
may just be the placebo effect :) Well, that gave me a kernel panic:
softdep_write_inodeblock: indirect pointer #0 mismatch 0 != 1386

Then it did a stack trace, syncing disks..., then
panic: softdep_disk_write_complete: lock is held

Another traceback, crash dump, then another panic:
softdep_disk_write_complete: lock is held

Another traceback, crash dump, and finally it rebooted. Here's the
dmesg:

uid 0 comm nbinstall on /: file system full
panic: softdep_write_inodeblock: indirect pointer #0 mismatch 0 != 1386
Begin traceback...
initiate_write_inodeblock(cb48fa84,c36c0410,c36c0410,c03693e0) at initiate_write_inodeblock+0x1cd
softdep_disk_io_initiation(c36c0410,4,cb368a90,c01d0b5f) at softdep_disk_io_initiation+0x9a
spec_strategy(cb368aa8,cb51e8a0,c0691000,c01d035c,1) at spec_strategy+0x2c
VOP_STRATEGY(c36c0410,cb044390,c36cd4e0,c03693e0) at VOP_STRATEGY+0x2b
bwrite(c36c0410,c36c0410,1,cb21ac5f,cb06209c) at bwrite+0xe3
ffs_update(cb368b5c,c36cd4e0,cb368b70,c01d05c9,cb51e8a0) at ffs_update+0x27f
VOP_UPDATE(cb06209c,0,0,1,c078d800) at VOP_UPDATE+0x3b
ffs_balloc(cb368c90,cb368d0c,cb368ca0,c024fd8f,cb06209c) at ffs_balloc+0xcf3
VOP_BALLOC(cb06209c,18000,0,2000,c078d800) at VOP_BALLOC+0x4f
ffs_gop_alloc(cb06209c,18000,0,2000,0) at ffs_gop_alloc+0x8b
ffs_write(cb368e4c,30002,1,0,499c0) at ffs_write+0x5e5
VOP_WRITE(cb06209c,cb368ee0,1,c078d800,cb368f80) at VOP_WRITE+0x3b
vn_write(cb057068,cb057090,cb368ee0,c078d800,1) at vn_write+0x9f
dofilewrite(cb21aae0,3,cb057068,48116000,499c0) at dofilewrite+0x9b
sys_write(cb21aae0,cb368f80,cb368f78,c0260974) at sys_write+0x67
syscall_plain(1f,1f,1f,1f,48116000) at syscall_plain+0xa7
End traceback...
syncing disks... panic: softdep_disk_write_complete: lock is held
Begin traceback...
softdep_disk_write_complete(c36b9660,2000,cb368438,c026d871) at softdep_disk_write_complete+0x20
biodone(c36b9660,0,c07f0580,2fff12bb) at biodone+0x52
scsipi_complete(c07f0f00,0,201dc243,3a4f2,c07f0f00) at scsipi_complete+0x431
scsipi_done(c07f0f00,c0771900,1,c0146784,c07ce608) at scsipi_done+0x10d
esiop_scsicmd_end(c07ce608,7fffffff,80000000,0) at esiop_scsicmd_end+0x17d
esiop_checkdone(c063f200,0,c0146784,ca7e9000) at esiop_checkdone+0x35e
esiop_intr(c063f200) at esiop_intr+0x7c
Xintr10() at Xintr10+0x7e
--- interrupt ---
genfs_putpages(cb368774,0,0,0,0) at genfs_putpages+0x4f1
ffs_putpages(cb368774,c0265272,cb36878c,c07bfdc0,0) at ffs_putpages+0x11d
VOP_PUTPAGES(cb1621d4,0,0,0,0,11,c06b9000,0) at VOP_PUTPAGES+0x49
ffs_full_fsync(cb36888c,cb1621d4,c06b9000,c01dbf72,0) at ffs_full_fsync+0x89
ffs_fsync(cb36888c,10012,cb3688a0,c01dbf72,cb1621d4) at ffs_fsync+0x3c
VOP_FSYNC(cb1621d4,c078d800,0,0,0,0,0,cb21aae0) at VOP_FSYNC+0x58
ffs_sync(c0693800,2,c078d800,cb21aae0) at ffs_sync+0xcf
sys_sync(cb21aae0,0,0,c01d62c0,100) at sys_sync+0x5a
vfs_shutdown(cb3689cc,1,fff0,c02d9ed2,c01ba768) at vfs_shutdown+0x6a
cpu_reboot(100,0,c0677f40,cb3689ac,cb7526d8) at cpu_reboot+0x3b
panic(c02c6a40,c02c69e4,0,0,56a) at panic+0x123
initiate_write_inodeblock(cb48fa84,c36c0410,c36c0410,c03693e0) at initiate_write_inodeblock+0x1cd
softdep_disk_io_initiation(c36c0410,4,cb368a90,c01d0b5f) at softdep_disk_io_initiation+0x9a
spec_strategy(cb368aa8,cb51e8a0,c0691000,c01d035c,1) at spec_strategy+0x2c
VOP_STRATEGY(c36c0410,cb044390,c36cd4e0,c03693e0) at VOP_STRATEGY+0x2b
bwrite(c36c0410,c36c0410,1,cb21ac5f,cb06209c) at bwrite+0xe3
ffs_update(cb368b5c,c36cd4e0,cb368b70,c01d05c9,cb51e8a0) at ffs_update+0x27f
VOP_UPDATE(cb06209c,0,0,1,c078d800) at VOP_UPDATE+0x3b
ffs_balloc(cb368c90,cb368d0c,cb368ca0,c024fd8f,cb06209c) at ffs_balloc+0xcf3
VOP_BALLOC(cb06209c,18000,0,2000,c078d800) at VOP_BALLOC+0x4f
ffs_gop_alloc(cb06209c,18000,0,2000,0) at ffs_gop_alloc+0x8b
ffs_write(cb368e4c,30002,1,0,499c0) at ffs_write+0x5e5
VOP_WRITE(cb06209c,cb368ee0,1,c078d800,cb368f80) at VOP_WRITE+0x3b
vn_write(cb057068,cb057090,cb368ee0,c078d800,1) at vn_write+0x9f
dofilewrite(cb21aae0,3,cb057068,48116000,499c0) at dofilewrite+0x9b
sys_write(cb21aae0,cb368f80,cb368f78,c0260974) at sys_write+0x67
syscall_plain(1f,1f,1f,1f,48116000) at syscall_plain+0xa7
End traceback...

dumping to dev 4,1 offset 2831
dump 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 succeeded


panic: softdep_disk_write_complete: lock is held
Begin traceback...
softdep_disk_write_complete(c36cd9c0,2000,1,1) at softdep_disk_write_complete+0x20
biodone(c36cd9c0,c063f400,1,c0351560) at biodone+0x52
scsipi_complete(c07f0dd0,200,c03515f8,0,c07f0dd0) at scsipi_complete+0x431
scsipi_done(c07f0dd0,c07649c0,cb3680f8,c026e5fb,c07446a4) at scsipi_done+0x10d
esiop_scsicmd_end(c07446a4,c063f200,cb368128,c02567b0) at esiop_scsicmd_end+0x17d
esiop_checkdone(c063f200,cb2d2c6c,78,0) at esiop_checkdone+0x35e
esiop_intr(c063f200,c0677780,cb3681d8,c010b371,3e8,c074489c,0,0) at esiop_intr+0x7c
esiop_scsipi_request(c063f22c,0,c07bf868,1) at esiop_scsipi_request+0x46a
scsipi_run_queue(c063f22c,fff0018,cb368248,c02646b7) at scsipi_run_queue+0x1ab
scsipi_execute_xs(c07bf868,cb36830c,a,c0106f46,103) at scsipi_execute_xs+0x208
scsi_scsipi_cmd(c0679100,cb36830c,a,0,0) at scsi_scsipi_cmd+0xe7
scsipi_command(c0679100,cb36830c,a,0,0) at scsipi_command+0x68
sd_scsibus_flush(c0671600,3,8000000,1) at sd_scsibus_flush+0x5d
sd_shutdown(c0671600,40800,2,7f00000) at sd_shutdown+0x2a
doshutdownhooks(cb3683c4,1,ffdd,c02d9ed2,c01ba768) at doshutdownhooks+0x26
cpu_reboot(104,0,c0679100,22009,c36b9660) at cpu_reboot+0x68
panic(c02c6b40,c07f0f00,c06716dc,48d5cf,c36b9660) at panic+0x123
softdep_disk_write_complete(c36b9660,2000,cb368438,c026d871) at softdep_disk_write_complete+0x20
biodone(c36b9660,0,c07f0580,2fff12bb) at biodone+0x52
scsipi_complete(c07f0f00,0,201dc243,3a4f2,c07f0f00) at scsipi_complete+0x431
scsipi_done(c07f0f00,c0771900,1,c0146784,c07ce608) at scsipi_done+0x10d
esiop_scsicmd_end(c07ce608,7fffffff,80000000,0) at esiop_scsicmd_end+0x17d
esiop_checkdone(c063f200,0,c0146784,ca7e9000) at esiop_checkdone+0x35e
esiop_intr(c063f200) at esiop_intr+0x7c
Xintr10() at Xintr10+0x7e
--- interrupt ---
genfs_putpages(cb368774,0,0,0,0) at genfs_putpages+0x4f1
ffs_putpages(cb368774,c0265272,cb36878c,c07bfdc0,0) at ffs_putpages+0x11d
VOP_PUTPAGES(cb1621d4,0,0,0,0,11,c06b9000,0) at VOP_PUTPAGES+0x49
ffs_full_fsync(cb36888c,cb1621d4,c06b9000,c01dbf72,0) at ffs_full_fsync+0x89
ffs_fsync(cb36888c,10012,cb3688a0,c01dbf72,cb1621d4) at ffs_fsync+0x3c
VOP_FSYNC(cb1621d4,c078d800,0,0,0,0,0,cb21aae0) at VOP_FSYNC+0x58
ffs_sync(c0693800,2,c078d800,cb21aae0) at ffs_sync+0xcf
sys_sync(cb21aae0,0,0,c01d62c0,100) at sys_sync+0x5a
vfs_shutdown(cb3689cc,1,fff0,c02d9ed2,c01ba768) at vfs_shutdown+0x6a
cpu_reboot(100,0,c0677f40,cb3689ac,cb7526d8) at cpu_reboot+0x3b
panic(c02c6a40,c02c69e4,0,0,56a) at panic+0x123
initiate_write_inodeblock(cb48fa84,c36c0410,c36c0410,c03693e0) at initiate_write_inodeblock+0x1cd
softdep_disk_io_initiation(c36c0410,4,cb368a90,c01d0b5f) at softdep_disk_io_initiation+0x9a
spec_strategy(cb368aa8,cb51e8a0,c0691000,c01d035c,1) at spec_strategy+0x2c
VOP_STRATEGY(c36c0410,cb044390,c36cd4e0,c03693e0) at VOP_STRATEGY+0x2b
bwrite(c36c0410,c36c0410,1,cb21ac5f,cb06209c) at bwrite+0xe3
ffs_update(cb368b5c,c36cd4e0,cb368b70,c01d05c9,cb51e8a0) at ffs_update+0x27f
VOP_UPDATE(cb06209c,0,0,1,c078d800) at VOP_UPDATE+0x3b
ffs_balloc(cb368c90,cb368d0c,cb368ca0,c024fd8f,cb06209c) at ffs_balloc+0xcf3
VOP_BALLOC(cb06209c,18000,0,2000,c078d800) at VOP_BALLOC+0x4f
ffs_gop_alloc(cb06209c,18000,0,2000,0) at ffs_gop_alloc+0x8b
ffs_write(cb368e4c,30002,1,0,499c0) at ffs_write+0x5e5
VOP_WRITE(cb06209c,cb368ee0,1,c078d800,cb368f80) at VOP_WRITE+0x3b
vn_write(cb057068,cb057090,cb368ee0,c078d800,1) at vn_write+0x9f
dofilewrite(cb21aae0,3,cb057068,48116000,499c0) at dofilewrite+0x9b
sys_write(cb21aae0,cb368f80,cb368f78,c0260974) at sys_write+0x67
syscall_plain(1f,1f,1f,1f,48116000) at syscall_plain+0xa7
End traceback...

dumping to dev 4,1 offset 2831
dump 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1

When it came back up, fsck -p cranked along, then didn't know what to
do with:

UNALLOCATED  I=181114  OWNER=root MODE=0
SIZE=0 MTIME=Jul  4 11:58 2002
NAME=?/../../../../../../../../../../../../../../../../../../../../../../../../../../ [a WHOLE bunch more ../s, probably about 800 characters total]/CVS

I suspect that's unrelated to the cause of the panic though; I use a
union mount over /usr/src and do all of my building on the top layer of
the union, and a bit earlier, I had noticed that I had some empty
directories in the upper layer that no longer existed in the real
/usr/src. So I figured I'd first get rid of the CVS directories, by
doing "find . -name CVS | xargs rmdir". But that ended up removing all
the CVS directories, not just the ones that were empty (at least I think
that's what happened). So I did a "find . -type W | xargs rm -W" [1] to
remove the whiteouts from the upper layer, only to get a few "Bad file
descriptor" errors. An ls showed that CVS directories had turned into
CVS files?! Actually, I use tcsh's built-in "ls-F", and it didn't show
the trailing slash, so I assume they were now files; doing an ls -l gave
me "Bad file descriptor".
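
For what it's worth, a more cautious way to pick out only the empty CVS
directories might look like the sketch below. It is only an illustration:
list_empty_cvs_dirs is a made-up helper name, it relies on nothing beyond
POSIX find and ls -A, and it is untested on a union mount, where whiteouts
and shadow directories may well behave differently.

```shell
# Hypothetical helper: print every directory named CVS under "$1" that
# contains no entries at all. It only prints; it removes nothing.
list_empty_cvs_dirs() {
    find "$1" -depth -type d -name CVS -print | while IFS= read -r dir; do
        # ls -A lists all entries except . and .., so empty output
        # means the directory itself is empty.
        if [ -z "$(ls -A "$dir")" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Running "list_empty_cvs_dirs ." and reading the list before piping it to
"xargs rmdir" would at least show which directories are about to go away.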

So I left it at that figuring I'd mess with it later... then I got the
panic and the fsck failure. Running fsck manually (without -p) let me
remove the stray CVS files/directories, and things seem well again.

[1] Nobody's fixed my pet bug report yet, PR bin/5419 "find's "-type W"
option doesn't work". I'm using my patched version of find where -type
W does work. C'mon, it's been over 4 years now, and it's a trivial
patch!
-- 
Name: Dave Huang         |  Mammal, mammal / their names are called /
INet: khym@azeotrope.org |  they raise a paw / the bat, the cat /
FurryMUCK: Dahan         |  dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 26 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++