Subject: Re: How stable is NetBSD on a sun3?
To: None <mcr@sandelman.ocunix.on.ca>
From: Gordon W. Ross <gwr@mc.com>
List: port-sun3
Date: 06/04/1996 10:59:37
> Date: Tue, 04 Jun 1996 01:20:43 -0400
> From: Michael Richardson <mcr@sandelman.ocunix.on.ca>

>   Okay, so *how* do we debug the shared library problems????

It's rather difficult.  Here is what I've been trying:

	(1) Compile a kernel with DDB and PMAP_DEBUG options.

	(2) Put a ddb breakpoint in trapsignal() and find out
	what virtual address the process is core dumping on.
	Using that, try to figure out which page has the wrong
	data.  It is usually not an address in the backtrace,
	but by examining the user-space library data, you may
	be able to identify the page that has wrong data.

	(3) Run the kernel with pmap_db_watchva set to the
	virtual page address that typically has wrong data.
	Record (remote console) all pmap changes for that
	virtual address, and try to figure out which one is
	wrong.  (and why 8^)
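
The filtering idea behind step (3) can be sketched in user-space C.
This is only an illustration, not the kernel's actual code: the names
watch_va, set_watch() and watch_log() are hypothetical stand-ins, and
the 8 KB page size is an assumption about the sun3 MMU.  The faulting
address from step (2) is truncated to its page, and only pmap changes
touching that page are logged.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: 8 KB pages (13-bit page offset). */
#define PAGE_SHIFT 13u
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)

/* Hypothetical stand-in for the watch variable; 0 means "disabled". */
static uint32_t watch_va;

/* Round a virtual address down to the start of its page. */
static uint32_t page_trunc(uint32_t va)
{
    return va & ~PAGE_MASK;
}

/* Arm the watch using the faulting address found in step (2). */
static void set_watch(uint32_t fault_va)
{
    watch_va = page_trunc(fault_va);
}

/*
 * Called from each (simulated) pmap operation.  Logs the change and
 * returns 1 only when it touches the watched page; otherwise returns 0.
 */
static int watch_log(const char *op, uint32_t va, uint32_t pte)
{
    if (watch_va == 0 || page_trunc(va) != watch_va)
        return 0;
    printf("%s: va=0x%08lx pte=0x%08lx\n",
           op, (unsigned long)va, (unsigned long)pte);
    return 1;
}
```

With the watch armed by set_watch(0x0002345c), a call such as
watch_log("enter", 0x00022010, pte) is logged, while changes to any
other page stay silent, which keeps the remote console log short
enough to read.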

My suspicion is that there is a missing interrupt protection 
somewhere, causing reentrance in some unfortunate place.

I thought I was getting close, then the bug disappeared again...

If you can catch it "red-handed" in unwanted recursion, save
the stack backtrace so we can figure out where the missing
interrupt protection was.
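
One way to catch that kind of recursion is a depth counter around the
suspect code.  The following is a user-space simulation only, under
stated assumptions: update_mapping(), deliver_interrupt() and the
intr_masked flag are hypothetical stand-ins for a pmap routine, an
interrupt arriving mid-update, and the interrupt priority mask.  In a
real kernel the "caught" branch is where one would drop into the
debugger and save the backtrace.

```c
#include <stdio.h>

static int intr_masked;    /* simulated interrupt mask (1 = blocked) */
static int intr_pending;   /* simulated interrupt waiting to run */
static int in_update;      /* nesting depth inside update_mapping() */
static int reentry_count;  /* times reentry was caught */

static void update_mapping(int protect);

/* Run the pending interrupt handler unless interrupts are masked. */
static void deliver_interrupt(void)
{
    if (intr_pending && !intr_masked) {
        intr_pending = 0;
        update_mapping(0);   /* handler touches the same data */
    }
}

/* Hypothetical routine that must not be reentered. */
static void update_mapping(int protect)
{
    int old_mask = intr_masked;

    if (protect)
        intr_masked = 1;     /* block interrupts for the update */

    if (in_update > 0) {
        /* Caught red-handed: a real kernel would save a backtrace here. */
        reentry_count++;
        fprintf(stderr, "update_mapping reentered, depth %d\n", in_update);
    }

    in_update++;
    deliver_interrupt();     /* interrupt arrives in mid-update */
    in_update--;

    if (protect) {
        intr_masked = old_mask;
        deliver_interrupt(); /* deferred interrupt runs after the update */
    }
}
```

Setting intr_pending = 1 and calling update_mapping(0) trips the
counter, while update_mapping(1) defers the handler until the update
is finished and the counter stays at zero.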

Gordon