Subject: 1.6.1 cgd panics
To: None <netbsd-users@netbsd.org>
From: Jorgen Lundman <lundman@lundman.net>
List: netbsd-users
Date: 11/24/2003 22:35:00
Hello,

NetBSD-1.6.1 i386

Using the backported cgd-1.6-20030912.diff, as well as my own backport of the
nvidia IDE controller support (just a matter of adding the product code) and of
nvidia's ex interface.

I seem to get a panic every two days or so, and the cause appears to be in the
filesystem area. The first one was in ffs_alloc, but that is just from memory,
as no core was saved. I lost cgd2 and cgd3 from that. I tried fsck'ing, but it
was a real mess: it ended up in an infinite loop and would never finish. Most
likely a block was written unencrypted, or one was read but not decrypted, which
would be a somewhat less than desired thing.
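
To make that suspicion concrete, here is a toy model of an encrypting block
layer. It is only a sketch: the XOR keystream is a stand-in and is not the
cipher cgd uses, and the names (write_block, read_block, skip_encrypt, the key)
are made up for illustration. It shows that a block stored in cleartext comes
back as garbage once it is read through the decrypting path, which is the kind
of damage fsck would then trip over.

```python
import hashlib

SECTOR_SIZE = 512  # assumed sector size for this toy model


def keystream(key: bytes, blkno: int, length: int) -> bytes:
    """Derive a per-block keystream (stand-in cipher, NOT what cgd uses)."""
    out = b""
    counter = 0
    while len(out) < length:
        seed = key + blkno.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(seed).digest()
        counter += 1
    return out[:length]


def xcrypt(key: bytes, blkno: int, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, blkno, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def write_block(disk, key, blkno, plain, skip_encrypt=False):
    """Store a block; skip_encrypt=True models the suspected bug."""
    disk[blkno] = plain if skip_encrypt else xcrypt(key, blkno, plain)


def read_block(disk, key, blkno):
    """Read a block through the decrypting layer."""
    return xcrypt(key, blkno, disk[blkno])


key = b"hypothetical-key"
plain = b"inode table data".ljust(SECTOR_SIZE, b"\0")
disk = {}
write_block(disk, key, 0, plain)                     # normal path
write_block(disk, key, 1, plain, skip_encrypt=True)  # suspected bug
good = read_block(disk, key, 0)  # round-trips to the original data
bad = read_block(disk, key, 1)   # cleartext "decrypted" into garbage
```

The same thing happens in the other direction: a correctly encrypted block that
is handed up without being decrypted looks just as random to the filesystem.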

The second panic left a core; the exact trace is:

panic: blkfree: freeing free frag
#0  0x1 in ?? ()
#1  0xc03711b7 in cpu_reboot ()
#2  0xc029566e in panic ()
#3  0xc028719d in lockmgr ()
#4  0xc02b8448 in genfs_lock ()
#5  0xc02b744e in VOP_LOCK ()
#6  0xc02b6c11 in vn_lock ()
#7  0xc02b0638 in vget ()
#8  0xc024d767 in ffs_sync ()
#9  0xc02b2b36 in sys_sync ()
#10 0xc02b1b56 in vfs_shutdown ()
#11 0xc037118f in cpu_reboot ()
#12 0xc029566e in panic ()
#13 0xc02421c4 in ffs_blkfree ()
#14 0xc0244756 in ffs_truncate ()
#15 0xc02b7745 in VOP_TRUNCATE ()
#16 0xc025f8ec in ufs_inactive ()
#17 0xc02b73ee in VOP_INACTIVE ()
#18 0xc02b0742 in vput ()
#19 0xc02631e5 in ufs_remove ()
#20 0xc02b71a1 in VOP_REMOVE ()
#21 0xc02b4463 in sys_unlink ()
#22 0xc037aff3 in syscall_plain ()
#23 0xc0100e74 in syscall1 ()
can not access 0xbfbfdc54, invalid translation (invalid PDE)
can not access 0xbfbfdc54, invalid translation (invalid PDE)
Cannot access memory at address 0xbfbfdc54
(gdb)
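
As a rough illustration of what the panic message means, as far as I understand
it: the allocator keeps a map of which fragments are free, and freeing a
fragment that the map already shows as free means the on-disk bookkeeping is
inconsistent. The sketch below is a toy model only, not the kernel code; FragMap
and its methods are hypothetical names.

```python
class FragMap:
    """Toy free-fragment map; True means the fragment is free."""

    def __init__(self, nfrags):
        self.free = [True] * nfrags

    def alloc(self, frag):
        """Mark a fragment as in use."""
        if not self.free[frag]:
            raise RuntimeError("alloc: frag already in use")
        self.free[frag] = False

    def blkfree(self, frag):
        """Release a fragment; refuse if the map says it is already free."""
        if self.free[frag]:
            raise RuntimeError("blkfree: freeing free frag")
        self.free[frag] = True


def demo():
    """Free the same fragment twice and return the resulting error text."""
    fm = FragMap(8)
    fm.alloc(3)
    fm.blkfree(3)      # first free is fine
    try:
        fm.blkfree(3)  # second free hits the consistency check
    except RuntimeError as err:
        return str(err)
    return None
```

A corrupted free map or a corrupted inode block list would both reach the same
check, so the message on its own does not say which of the two was damaged.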

fsck'ing doesn't look good; cgd1 appears to be gone. I am not sure how many of
the others are affected (there are about 14 in total).

Not using softdep, just plain vanilla mount.

Is it worth trying a (stable?) -current kernel to see if things go better? Can
someone recommend one? Is there a known problem with the version of kernel/cgd
that I am running?

It is too much to expect fsck to be able to handle a fs with this level of
corruption, but is anyone interested in seeing what issues I get?

Sincerely,

Lundy


-- 
Jorgen Lundman       | <lundman@lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)