Subject: Re: crash dump failing on machine with 4GB
To: NetBSD port-sparc64 mailing list <port-sparc64@NetBSD.org>
From: Chris Ross <cross+netbsd@distal.com>
List: port-sparc64
Date: 09/27/2007 09:26:06
On Sep 26, 2007, at 14:22, Chris Ross wrote:
> On Sep 26, 2007, at 13:51, matthew green wrote:
>> can you get a stack trace with symbols?  or use gdb to
>> find them out from these values?
>
>   Of course.  Here's a backtrace after the failed "reboot 0x104"  
> used to cause the dump attempt.
>
> dumping to dev 7,1 offset 4310231
> dump 4096 esiop0: unable to load cmd DMA map: -1i/o error
> sd0(esiop0:0:0:0): polling command not done
> panic: scsipi_execute_xs
> cpu0: kdb breakpoint at 13f3e80
> Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        nop
> db> bt
> scsipi_execute_xs(5f89c00, e0016d96, a, 0, 0, 4) at netbsd:scsipi_execute_xs+0x318
> sd_flush(746fc00, 103, 0, 0, 0, 8000000000001034) at netbsd:sd_flush+0x84
> sd_shutdown(746fc00, 5, 0, 0, e0016fb8, 0) at netbsd:sd_shutdown+0x18
> doshutdownhooks(161eaa8, 5, 0, 10, 1857800, f) at netbsd:doshutdownhooks+0x30

   So, does anyone have any suggestions on where I should go from
here?  I looked into the "unable to load cmd DMA map" error, which
reports an EIO from a call to bus_dmamap_load().  Should I trace into
that function (via the macro, etc.) and figure out whether it's
returning EIO for some reason related to the physical memory address
it's given?  Or, can someone look at the code in doshutdown() to see
if the physical memory mapping calls "look right"?  I was looking at
amd64, figuring that it would be more likely to have this
functionality working, and I noticed that the pmap_* call(s) it uses
are different, but that may not be unusual...
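
   To make the "physical memory address" hypothesis concrete, here is
a minimal sketch of the kind of range check it implies.  It assumes
the failure is tied to a 32-bit DMA addressing limit, which nothing
in the output above confirms.  fits_below_4gb() is a hypothetical
name used only for illustration; it is not a NetBSD kernel function
and does not reflect how bus_dmamap_load() is implemented.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of NetBSD): returns 1 if the
 * physical range [pa, pa + len) lies entirely below the 4GB
 * boundary, 0 otherwise.  Assumes a device that can only address
 * 32 bits of physical memory.
 */
static int
fits_below_4gb(uint64_t pa, uint64_t len)
{
	const uint64_t limit = UINT64_C(1) << 32;	/* 4GB */

	/* Checking pa first keeps (limit - pa) from wrapping around. */
	return pa < limit && len <= limit - pa;
}
```

   With a 4096-byte dump chunk, a buffer at 0xfffff000 passes this
check, while one at 0x100000000 does not.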

   Thanks.  I know not everyone has a 4GB sparc64 to play with, so
I'm happy to work on this, but I will need to get this machine into
production in the not-too-distant future, so I need to keep moving.

   Thanks again for all help.

                                                  - Chris

P.S.
   Is the last argument to sd_flush(), quoted in the backtrace above,
indicative of a problem?  It just looks "odd" compared to the rest of
the parameters.