Subject: Re: ddb help
To: Eduardo Horvath <eeh@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-sparc64
Date: 10/26/2004 00:17:16
On Mon, Oct 25, 2004 at 06:46:53PM +0000, Eduardo Horvath wrote:
> On Sun, Oct 24, 2004 at 10:19:02PM +0200, Manuel Bouyer wrote:
> > Hi,
> > I can't understand how this can happen. Is it possible that ddb is printing
> > the wrong address here, or is missing a function call in the stack frame ?
> > This is a current GENERIC32 kernel, recompiled with -g
> 
> Stack traces are done by traversing the register windows saved to the stack
> and printing out the linkage pointers.  It is possible that the register 
> windows were never saved to the stack, they were overwritten, the stack
> pointer is pointing to the wrong place, or there have been some tail calls
> and the bottom register window has been recycled.  In this instance it is
> most likely the latter.
> 
> > 
> > netbsd:wdc_ata_bio_start+0x484: stb             %g0, [%l0 + 0xa]
> > db> 
> > netbsd:wdc_ata_bio_start+0x488: or              %g0, %i0, %o0
> > db> 
> > netbsd:wdc_ata_bio_start+0x48c: call            netbsd:wdc_ata_bio_done
> > db> 
> 
> Here's a call to netbsd:wdc_ata_bio_done.  It probably calls something
> else just before returning, so that call never got its own stack frame.

Yes, it probably is. The end of wdc_ata_bio_done() is:
	(*chp->ch_drive[drive].drv_done)(chp->ch_drive[drive].drv_softc);
	printf()
	atastart(chp);
}

It could be that drv_done is NULL, but that would mean the memory was
corrupted.
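
To convince myself of the tail-call explanation, here is a toy model of the
stack walk. It is only a sketch: struct frame, walk_frames() and
trace_shows_bio_done() are made-up names for illustration, and the layout has
nothing to do with the real sparc64 register windows or with how ddb is
implemented. It assumes the compiler turned the final atastart(chp) into a
tail call, so the frame of wdc_ata_bio_done gets recycled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a saved frame; NOT the real sparc64 register-window layout. */
struct frame {
	const struct frame *caller;	/* saved frame pointer of the caller */
	const char *ret_into;		/* function the saved return address points into */
};

/*
 * Follow the saved linkage pointers, as a stack trace does, and store the
 * name of each function we would return into. Returns the number of entries.
 */
static int
walk_frames(const struct frame *fp, const char **names, int max)
{
	int n = 0;

	assert(max > 0);
	while (fp != NULL && n < max) {
		names[n++] = fp->ret_into;
		fp = fp->caller;
	}
	return n;
}

/*
 * Build the chain as seen from inside atastart() and report whether
 * wdc_ata_bio_done shows up in the trace. With tail_call != 0 the frame of
 * wdc_ata_bio_done has been recycled for atastart (assumption), so atastart
 * returns directly into wdc_ata_bio_start.
 */
static int
trace_shows_bio_done(int tail_call)
{
	const struct frame start = { NULL, "(caller of wdc_ata_bio_start)" };
	const struct frame done = { &start, "wdc_ata_bio_start" };
	const struct frame leaf_normal = { &done, "wdc_ata_bio_done" };
	const struct frame leaf_tail = { &start, "wdc_ata_bio_start" };
	const char *names[8];
	int i, n;

	n = walk_frames(tail_call ? &leaf_tail : &leaf_normal, names, 8);
	for (i = 0; i < n; i++)
		if (strcmp(names[i], "wdc_ata_bio_done") == 0)
			return 1;
	return 0;
}
```

With an ordinary call the walk passes through wdc_ata_bio_done; with the tail
call it goes straight from atastart to wdc_ata_bio_start, which would match
the trace I saw.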

> [...]
> 
> 1) Enable traptrace.  It should give you a better idea of the calling sequence.
> 
> 2) If you can find the end of the stack, dump out the bottom trapframe.  You
> might get a better idea of the machine state from that.

OK, I'll try that. Thanks for the lesson.
Unfortunately I lost this ddb session (ddb hung and I had to power-cycle),
and couldn't reproduce this. I do get other strange behaviors, though.

I also found (the hard way, that is, in my hands) that I have a power issue
in this area; there is ~80V between the 2 grounds of 2 different power
sockets. One is for the U5, the other powers the ethernet switch to which
the U5 is connected. I have to investigate, but it could be the cause of
the strange behavior of the U5s (I tried 2 different ones) I connect to this
socket ...

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--