Subject: Re: Data modified on freelist...
To: Andrew Gallatin <gallatin@cs.duke.edu>
From: Chris G. Demetriou <cgd@cs.cmu.edu>
List: port-alpha
Date: 03/17/1997 12:46:04
> On my AS 500/266, I've been seeing the following warning when booting
> a recent generic kernel (netbsd-GENERIC-970312).  This message
> generally shows up just after init starts:
> 
> Data modified on freelist: word 0 of object 0xfffffe004a599300 size 48 previous type ??? (0x4a6ab0 != 0xdeadbeef)
> 
> Should I be worried?  Looking at the comment in kern/kern_malloc.c, it
> looks like this means there might be 'memory reuse problems'

I ran into the same thing on a '600 on Friday, and spent a good
portion of the day tracking it down (including writing a little malloc
'event' log tracker 8-).
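
For anyone curious what such a tracker can look like, here is a minimal
sketch of the general idea: a small ring buffer that records each
allocation and free, so the history of a suspect address can be looked
up afterwards. This is not the code I actually wrote; every name in it
(evlog_record, evlog_last, EVLOG_SIZE, and so on) is hypothetical and
made up for illustration.

```c
#include <stddef.h>

#define EVLOG_SIZE 8            /* hypothetical; a real log would be far larger */

enum ev_kind { EV_MALLOC, EV_FREE };

struct ev {
    enum ev_kind kind;          /* what happened */
    const void  *addr;          /* object address */
    size_t       size;          /* object size in bytes */
};

static struct ev     evlog[EVLOG_SIZE];
static unsigned long ev_count;  /* total events ever recorded */

/* Record one event, overwriting the oldest entry once the ring is full. */
static void evlog_record(enum ev_kind kind, const void *addr, size_t size)
{
    struct ev *e = &evlog[ev_count % EVLOG_SIZE];

    e->kind = kind;
    e->addr = addr;
    e->size = size;
    ev_count++;
}

/* Return the most recent event for addr, or NULL if none is still logged. */
static const struct ev *evlog_last(const void *addr)
{
    unsigned long n = ev_count < EVLOG_SIZE ? ev_count : EVLOG_SIZE;

    for (unsigned long i = 1; i <= n; i++) {
        const struct ev *e = &evlog[(ev_count - i) % EVLOG_SIZE];
        if (e->addr == addr)
            return e;
    }
    return NULL;
}
```

With hooks like these in the allocator, the object address printed in
the warning can be fed to evlog_last() to see who last allocated or
freed it.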

It seems like something in the isp driver is trashing _one_ piece of
memory.  I don't know why.  Matt Jacob and I have discussed it a bit,
but he's currently on vacation.

The nature of this problem is pretty well-known, and it looks "mostly
harmless."  I'd not worry about it unless you get other messages of
this form.  Once Matt gets back, he might want to talk to you further
about the problem...
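
To make the warning itself less mysterious: it comes from a check that
fills freed memory with a known pattern and verifies the pattern when
the memory is handed out again. The following is a simplified
user-space sketch of that technique, not the actual kern_malloc.c
code. The function names are hypothetical, and the 32-bit word size is
an assumption based on the 0xdeadbeef value in the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define POISON 0xdeadbeefU      /* fill pattern shown in the warning */

/* Fill a freed object with the poison pattern (size in bytes). */
static void poison_free_object(void *obj, size_t size)
{
    uint32_t *w = obj;

    for (size_t i = 0; i < size / sizeof(*w); i++)
        w[i] = POISON;
}

/*
 * Verify the pattern before the object is reused.
 * Returns -1 if it is intact, else the index of the first modified word.
 */
static long check_free_object(const void *obj, size_t size)
{
    const uint32_t *w = obj;

    for (size_t i = 0; i < size / sizeof(*w); i++) {
        if (w[i] != POISON) {
            printf("Data modified on freelist: word %zu of object %p "
                   "size %zu (0x%x != 0x%x)\n",
                   i, obj, size, (unsigned)w[i], (unsigned)POISON);
            return (long)i;
        }
    }
    return -1;
}
```

A mismatch means something wrote to the object after it was freed,
which is why a single report points at one stray write rather than at
general memory corruption.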


> On the same 500, I've also been seeing a few 'unexpected machine
> check' panics.  This 500 has some, well.. suspect dimms & Digital UNIX
> will occasionally kick out "Machine Check error corrected by
> processor" messages when ECC kicks in & corrects a single-bit error.
> A brief scan through arch/alpha/alpha/interrupt.c makes me think that
> NetBSD might panic on the same sort of interrupt, is that true?

Wow, yes.  NetBSD just panics when it takes an unexpected machine
check, though for some kinds (errors the hardware has already
corrected, like the ones you describe) it should just continue.

I could easily hack up a kernel that does the right thing, and send it
to you to test if you'd like.
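
Roughly, "the right thing" would be to classify the machine check and
panic only when it is not recoverable. A sketch of that decision is
below. It is not the code in arch/alpha/alpha/interrupt.c: the enum
values, the function name, and the `expected` flag are all
hypothetical, and real Alpha machine check codes are
processor-specific.

```c
#include <stdbool.h>

/* Hypothetical classification; real machine check codes differ per CPU. */
enum mck_kind { MCK_CORRECTED, MCK_UNCORRECTABLE, MCK_UNKNOWN };

enum mck_action { MCK_CONTINUE, MCK_PANIC };

static unsigned long mck_corrected_count;   /* corrected errors seen so far */

/*
 * Decide what to do with a machine check.
 * `expected` is an assumed flag meaning the kernel provoked the check
 * deliberately (for example while probing an address).
 */
static enum mck_action handle_machine_check(enum mck_kind kind, bool expected)
{
    if (expected)
        return MCK_CONTINUE;

    if (kind == MCK_CORRECTED) {
        /* Hardware already fixed the error: count it and carry on. */
        mck_corrected_count++;
        return MCK_CONTINUE;
    }

    /* Uncorrectable or unrecognized: continuing is not safe. */
    return MCK_PANIC;
}
```

Counting the corrected errors rather than ignoring them keeps a record
that the DIMMs are misbehaving, without taking the machine down.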



chris