Subject: Re: 1.6.1 cgd panics
To: None <netbsd-users@netbsd.org>
From: Jorgen Lundman <lundman@lundman.net>
List: netbsd-users
Date: 11/25/2003 11:41:04

This looks quite similar to

http://mail-index.netbsd.org/netbsd-help/2003/01/27/0027.html

in terms of the panic, but since I'm on 1.6.1 I assume it has already been fixed?

"I think there was a bug in the ufs_daddr_t change which could cause this.
This should be fixed now." doesn't give me much of a clue. Can I check whether my
sources are good?

My ffs_inode.c (which contains ffs_truncate) is at version:

/*      $NetBSD: ffs_inode.c,v 1.51 2001/12/18 10:57:21 fvdl Exp $      */


Or, if someone can recommend a -current that has been stable for them, I can give
that a quick go.
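
For comparing a source file's revision against whichever revision carries a fix, a minimal sketch in Python (not part of NetBSD; the helper names parse_ident and rev_at_least are made up for illustration). It assumes a plain trunk revision such as 1.51. Branch revisions with pulled-up fixes (four or more components) cannot be judged this way, and the revision that contains the fix is not stated in this thread, so it is left as a parameter.

```python
import re

# Matches an RCS/CVS ident string such as
#   $NetBSD: ffs_inode.c,v 1.51 2001/12/18 10:57:21 fvdl Exp $
IDENT_RE = re.compile(
    r"\$NetBSD: (?P<file>\S+),v (?P<rev>\d+(?:\.\d+)+) "
    r"(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<author>\S+) (?P<state>\S+) \$"
)


def parse_ident(line):
    """Return the fields of a $NetBSD$ ident string found in `line`."""
    m = IDENT_RE.search(line)
    if m is None:
        raise ValueError("no $NetBSD$ ident string found")
    info = m.groupdict()
    # Revisions compare numerically per component: 1.9 is older than 1.51.
    info["rev_tuple"] = tuple(int(p) for p in info["rev"].split("."))
    return info


def rev_at_least(rev, wanted):
    """True if trunk revision `rev` is the same as or newer than `wanted`.

    Assumption: both are two-component trunk revisions; branch
    revisions are rejected rather than guessed at.
    """
    a = tuple(int(p) for p in rev.split("."))
    b = tuple(int(p) for p in wanted.split("."))
    if len(a) != 2 or len(b) != 2:
        raise ValueError("only trunk revisions (e.g. 1.51) are handled")
    return a >= b


if __name__ == "__main__":
    header = "/*      $NetBSD: ffs_inode.c,v 1.51 2001/12/18 10:57:21 fvdl Exp $      */"
    info = parse_ident(header)
    print(info["file"], info["rev"], info["date"])
```

Given the revision named in a commit message for the fix, rev_at_least(info["rev"], that_revision) would say whether the local file is at least that new.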

Lund



Jorgen Lundman wrote:
> 
> Hello,
> 
> NetBSD-1.6.1 i386
> 
> Using the backported cgd-1.6-20030912.diff, as well as my own backport 
> of the nvidia IDE controller (just a matter of adding the product code) and 
> that of nvidia's ex interface.
> 
> I seem to have a panic every two days or so, and the cause appears to be 
> in the filesystem area. The first one was in ffs_alloc, but that is just from 
> memory, as no core was saved. I lost cgd2 and cgd3 from that. I tried fsck'ing, 
> but it was a real mess: it ended up in an infinite loop and would never 
> finish. Most likely a block was written unencrypted, or one was read but 
> not decrypted, which would be somewhat less than desirable.
> 
> The second panic left a core; more exactly:
> 
> panic: blkfree: freeing free frag
> #0  0x1 in ?? ()
> #1  0xc03711b7 in cpu_reboot ()
> #2  0xc029566e in panic ()
> #3  0xc028719d in lockmgr ()
> #4  0xc02b8448 in genfs_lock ()
> #5  0xc02b744e in VOP_LOCK ()
> #6  0xc02b6c11 in vn_lock ()
> #7  0xc02b0638 in vget ()
> #8  0xc024d767 in ffs_sync ()
> #9  0xc02b2b36 in sys_sync ()
> #10 0xc02b1b56 in vfs_shutdown ()
> #11 0xc037118f in cpu_reboot ()
> #12 0xc029566e in panic ()
> #13 0xc02421c4 in ffs_blkfree ()
> #14 0xc0244756 in ffs_truncate ()
> #15 0xc02b7745 in VOP_TRUNCATE ()
> #16 0xc025f8ec in ufs_inactive ()
> #17 0xc02b73ee in VOP_INACTIVE ()
> #18 0xc02b0742 in vput ()
> #19 0xc02631e5 in ufs_remove ()
> #20 0xc02b71a1 in VOP_REMOVE ()
> #21 0xc02b4463 in sys_unlink ()
> #22 0xc037aff3 in syscall_plain ()
> ---Type <return> to continue, or q <return> to quit---
> #23 0xc0100e74 in syscall1 ()
> can not access 0xbfbfdc54, invalid translation (invalid PDE)
> can not access 0xbfbfdc54, invalid translation (invalid PDE)
> Cannot access memory at address 0xbfbfdc54
> (gdb)
> 
> fsck'ing doesn't look good; cgd1 appears to be gone. I'm not sure how many 
> others are affected (there are about 14 in total).
> 
> I'm not using softdep, just a plain vanilla mount.
> 
> Is it worth trying a (stable?) -current kernel to see if things go 
> better? Can someone recommend one? Is there a known problem with the 
> version of the kernel/cgd that I am running?
> 
> It is probably too much to expect fsck to handle a filesystem with this level 
> of corruption, but is anyone interested in seeing what issues I get?
> 
> Sincerely,
> 
> Lundy
> 
> 

-- 
Jorgen Lundman       | <lundman@lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)