Subject: Re: Data modified on freelist...
To: Chris G. Demetriou <cgd@cs.cmu.edu>
From: Andrew Gallatin <gallatin@cs.duke.edu>
List: port-alpha
Date: 03/17/1997 13:10:05

Chris G. Demetriou writes:

 > 
 > It seems like something in the isp driver is trashing _one_ piece of
 > memory.  I don't know why.  Matt Jacob and I have discussed it a bit,
 > but he's currently on vacation.
 > 
 > The nature of this problem is pretty well-known, and it looks "mostly
 > harmless."  I'd not worry about it unless you get other messages of
 > this form.  Once Matt gets back, he might want to talk to you further
 > about the problem...

I'm sold ;-)

 > 
 > > On the same 500, I've also been seeing a few 'unexpected machine
 > > check' panics.  This 500 has some, well.. suspect dimms & Digital UNIX
 > > will occasionally kick out "Machine Check error corrected by
 > > processor" messages when ECC kicks in & corrects a single-bit error.
 > > A brief scan through arch/alpha/alpha/interrupt.c makes me think that
 > > NetBSD might panic on the same sort of interrupt, is that true?
 > 
 > Wow, yes.  NetBSD just bites the dust when it takes an unexpected
 > machine check, though for some it should just continue.
 > 
 > I could easily hack up a kernel that does the right thing, and send it
 > to you to test if you'd like.

That'd be great!  I'll be happy to test it.

A few more things I should mention:

When running the snapshot kernel, I've gotten a few

	panic: pmap_enter_ptpage: PT page not entered

panics.  The machine locks up solid after these: it claims to be
syncing its disks, but the halt button has to be pressed.


When running a kernel built from sources supped this morning (ignore
the date mentioned below; I forgot to reset the date after rebooting
from DU), I cannot get it to go multi-user.  /bin/sh dies part-way
through /etc/rc:

pid 3 (sh): unaligned access: va=0x12010c43c pc=0x120000e90 ra=0x120000e90 op=ldq
pid 3 (sh): unaligned access: va=0x12010ad2c pc=0x120000e98 ra=0x120000e98 op=ldq
Mar 16 22:55:34 init: /bin/sh on /etc/rc terminated abnormally, going to single user mode

The first thing I was going to try was re-building sh from -current
sources, but it doesn't make sense to me that it would work fine w/a
week old kernel & die w/a brand-new one.
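
For reference, both faulting addresses in the messages above end in
0x...c, so they are 4-byte aligned but not 8-byte aligned, which is
what an ldq (a 64-bit quadword load) trips over.  A minimal sketch of
that arithmetic follows; the helper name misalignment() is made up for
illustration, and it says nothing about why sh ends up with such a
pointer in the first place:

```python
# Fault addresses copied from the "unaligned access" messages above.
FAULT_ADDRESSES = [0x12010c43c, 0x12010ad2c]

LDQ_SIZE = 8  # ldq loads a 64-bit quadword, so it wants 8-byte alignment


def misalignment(va, size=LDQ_SIZE):
    """Return how many bytes va sits past the previous size-aligned boundary."""
    return va % size


for va in FAULT_ADDRESSES:
    # A result of 0 would mean the address is properly aligned for ldq.
    print(f"va={va:#x} offset within quadword={misalignment(va)}")
```

Both addresses come out at offset 4, i.e. longword-aligned only.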

Thanks,

Drew
------------------------------------------------------------------------------
Duke University				Email:		gallatin@cs.duke.edu
Department of Computer Science		Phone:		(919) 660-6590