Subject: Re: Upgrading to -current
To: None <port-sun3@NetBSD.ORG>
From: der Mouse <mouse@Holo.Rodents.Montreal.QC.CA>
List: port-sun3
Date: 08/27/1996 11:11:51
> I'd be interested to see if the 3-chip-SIMM vs 9-chip-SIMM problems
> that used to plague a few Sun-3 platforms under SunOS could cause
> spurious bugs under NetBSD.

Well, the -3/150 that I've seen it happen on uses VME memory.
The -3/260 I've not seen it happen on uses VME memory.
Hmmm.

>> It appears to have something to do with shared libraries; an
>> executable that uses no shared libraries will (invariably, in my
>> experience) work, even when "everything dumps core".
> A Sun-3 dependency in mmap() ?   That might also explain why it seems
> to occur with greater frequency in times of high memory demand...

Yeah.  I heard some speculation about an in-core page of a shared
library ending up with the wrong data.  I wonder if it might be worth
writing something to run through and invalidate all in-core pages of
everything.  It'll kill the system with pageins for the next few
moments, but even that is a lot less disruptive than a reboot.  It'll
also provide useful diagnostic information based on whether it actually
cures the problem or not.
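
A rough sketch of the diagnostic half of that idea, in Python purely for
illustration: it does not invalidate anything, it only compares each page
of a file as seen through mmap() with the bytes read() returns for the
same offset.  The function name mismatched_pages is made up for this
example, and it assumes that a corrupted in-core page would show up as a
difference between the two paths, which may well not hold on a system
where both are served from the same cache.

```python
import mmap
import os
import sys


def mismatched_pages(path, page_size=mmap.PAGESIZE):
    """Return indices of pages whose mmap()ed bytes differ from read()."""
    bad = []
    size = os.path.getsize(path)
    if size == 0:
        return bad  # an empty file cannot be mapped; nothing to compare
    with open(path, "rb") as f:
        # Map the whole file read-only, so the check cannot modify it.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for index, offset in enumerate(range(0, size, page_size)):
                f.seek(offset)
                from_read = f.read(page_size)
                from_map = m[offset:offset + len(from_read)]
                if from_read != from_map:
                    bad.append(index)
    return bad


if __name__ == "__main__":
    # Hypothetical usage: pass the shared libraries to check as arguments.
    for name in sys.argv[1:]:
        print(name, mismatched_pages(name))
```

Run against the shared libraries once "everything dumps core", a
non-empty list would point at the in-core-page theory; an empty list
would not rule it out.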

> It seems to be uid-dependent too: Although the first core dump can
> result from virtually any process, I've found that only processes
> owned by root (or setuid root) dump core once things start falling
> over.

That's very interesting.  I hadn't noticed, because when it strikes
I've always been working as root.  (I wonder if this is why it hasn't
struck on the -3/260 yet - because very little of the work done there
has been done as root.)  Sometime on a setup where it happens I should
try a "make build" as a joe user and see if it strikes.

					der Mouse

			    mouse@collatz.mcrcim.mcgill.edu
		    01 EE 31 F6 BB 0C 34 36  00 F3 7C 5A C1 A0 67 1D