Subject: Re: crash dump failing on machine with 4GB
To: Juan RP <juan@xtrarom.org>
From: Chris Ross <cross+netbsd@distal.com>
List: port-sparc64
Date: 09/28/2007 23:21:24
On Sep 28, 2007, at 17:16, Juan RP wrote:
> You are right, it's initialized in scsipi_base.c:scsipi_get_xs()...
> but perhaps the callout was stopped previously and it wasn't
> reinitialized or something.
>
> Someone with scsipi(9) clue should answer :-)

   It looks like the callout_stop() called from esiop_scsicmd_end()
isn't necessarily paired with an earlier scsipi_get_xs(), at least in
the case of a crash dump.

   I put debugging printf()s in callout_init() and callout_stop(), and
in normal running I am seeing lots of output like:

Calling callout_init of 0x5f89d20
Calling callout_init of 0x5f89c20
Calling callout_init of 0x5f89e20
Calling callout_stop on 0x5f89d20
Calling callout_init of 0x5f89d20
Calling callout_stop on 0x5f89c20
Calling callout_stop on 0x5f89e20
Calling callout_stop on 0x5f89d20
Calling callout_init of 0x5f89d20
Calling callout_stop on 0x5f89d20
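
   For anyone wanting to reproduce this kind of tracing, here is a
rough user-space mock of it. It is not the actual kernel patch: the
reduced struct callout, the return value of callout_stop(), and the
CALLOUT_MAGIC value are stand-ins assumed for illustration. Only the
message formats are taken from the output above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder value for this sketch; the real one is in kern_timeout.c. */
#define CALLOUT_MAGIC 0x11deeba1u

/* Reduced stand-in for the kernel's struct callout (assumption). */
struct callout {
	uint32_t c_magic;
};

/* Traced init: print the address, then stamp the magic number. */
static void
callout_init(struct callout *c)
{
	assert(c != NULL);
	printf("Calling callout_init of %p\n", (void *)c);
	c->c_magic = CALLOUT_MAGIC;
}

/*
 * Traced stop: print the address, then return 1 if the magic check
 * would pass, or 0 where the kernel's diagnostic assertion would fire.
 */
static int
callout_stop(struct callout *c)
{
	printf("Calling callout_stop on %p\n", (void *)c);
	if (c->c_magic != CALLOUT_MAGIC) {
		printf("callout_stop: c %p, c_magic %x\n",
		    (void *)c, (unsigned)c->c_magic);
		return 0;
	}
	return 1;
}
```

   In this mock, a zero-filled struct callout that never went through
callout_init() makes callout_stop() return 0, which corresponds to the
"c_magic 0" case.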

   However, I then broke to ddb, entered "reboot 0x104", and saw only:

db> reboot 0x104
Frame pointer is at 0xe0016661
Call traceback:
14101f0(1, d, 0, e00171e0, 1857800, 0, e0016731) fp = e0016731
10c1e60(104, 0, e00170a8, 1860800, 1860bc8, 188c7e8, e00167f1) fp = e00167f1
10c1398(1, 0, 4, e0017170, e0017298, 188c7e8, e00168c1) fp = e00168c1
10c19c8(180f318, 4, 0, 0, e0017388, 0, e0016a11) fp = e0016a11
10c537c(141a3c8, 0, 2, 1898859, 0, 0, e0016b01) fp = e0016b01
141b884(0, 0, 0, 0, 187dc00, 1000000, e0016bd1) fp = e0016bd1
14193b8(101, e0017b60, 98b3250fc, d61ae9f400000000, c700000000, 18a4800, e0017131) fp = e0017131
1008c4c(e0017b60, 101, 141a3c0, 1d0006, 0, 1876800, e00172b1) fp = e00172b1
13e784c(189b990, 14dfc00, ffffffff, 0, 1818c00, 1d, e0017491) fp = e0017491
13e7ea8(61c4800, e0017e0c, 18245e0, 400, ffffffffffffffff, 40, e0017551) fp = e0017551
100914c(0, 0, e0017ed0, 18779d8, 13e7e60, 1000000, e0017621) fp = e0017621
129fef0(0, 161dda0, 4, 6, 187a800, 1000000, ffbd561) fp = ffbd561

dumping to dev 7,1 offset 4310231
dump Calling callout_stop on 0x187ea98
callout_stop: c 0x187ea98, c_magic 0
panic: kernel diagnostic assertion "c->c_magic == CALLOUT_MAGIC" failed: file "/data/NetBSD/src/sys/kern/kern_timeout.c", line 431
cpu0: kdb breakpoint at 141a3c0
Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        nop
db>
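
   As a side note on the 0x104 argument: a small sketch that decodes
it into reboot flag names. The bit values are the ones I recall from
<sys/reboot.h> and are repeated here only so the snippet builds
anywhere; check them against your own tree. The helper name
describe_howto() is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One reboot flag bit and its name. */
struct rb_flag {
	int bit;
	const char *name;
};

/* Subset of the RB_* flags; values assumed from <sys/reboot.h>. */
static const struct rb_flag rb_flags[] = {
	{ 0x004, "RB_NOSYNC" },	/* don't sync before reboot */
	{ 0x008, "RB_HALT" },	/* halt instead of rebooting */
	{ 0x100, "RB_DUMP" },	/* dump kernel memory before reboot */
};

/* Write the names of the flags set in howto into buf, joined by '|'. */
static char *
describe_howto(int howto, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(rb_flags) / sizeof(rb_flags[0]); i++) {
		if ((howto & rb_flags[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, rb_flags[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

   With those values, describe_howto(0x104, ...) yields
"RB_NOSYNC|RB_DUMP", which matches the output above going straight to
the dump without a "syncing disks..." step.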

   So, there doesn't appear to be any callout_init() happening for
that address. Again, I guess we need someone with scsipi knowledge to
explain why dev/ic/esiop.c thinks it should be callout_stop'ing that
scsipi_xfer. It is interesting to note that all of the scsipi_xfers
that the callout functions are called on seem to be at 0x5f89...,
except for the one here that fails.
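
   The "stop without a prior init" observation can also be checked
mechanically over a captured console log. The following is only a
sketch: the struct seen type, check_line(), and the fixed-size table
are made up for illustration, and it treats an address as fine once
it has been initialized at least once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 64

/* Addresses that have appeared in a "callout_init of" line so far. */
struct seen {
	unsigned long addr[MAX_TRACKED];
	size_t n;
};

/* Return 1 if address a has already been recorded, else 0. */
static int
seen_has(const struct seen *s, unsigned long a)
{
	size_t i;

	for (i = 0; i < s->n; i++)
		if (s->addr[i] == a)
			return 1;
	return 0;
}

/*
 * Feed one console line.  Returns 1 if it is a callout_stop on an
 * address with no earlier callout_init, otherwise 0.  strstr() is used
 * so that prefixes such as "dump " or "syncing disks... " are skipped.
 */
static int
check_line(struct seen *s, const char *line)
{
	const char *p;
	unsigned long a;

	assert(s != NULL && line != NULL);
	if ((p = strstr(line, "Calling callout_init of ")) != NULL &&
	    sscanf(p, "Calling callout_init of %lx", &a) == 1) {
		if (!seen_has(s, a) && s->n < MAX_TRACKED)
			s->addr[s->n++] = a;
		return 0;
	}
	if ((p = strstr(line, "Calling callout_stop on ")) != NULL &&
	    sscanf(p, "Calling callout_stop on %lx", &a) == 1)
		return !seen_has(s, a);
	return 0;
}
```

   Fed the lines from this message in order, it flags only the
callout_stop on 0x187ea98.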

   Even when I entered a plain "reboot" after the above, I saw:

db> reboot
syncing disks... Calling callout_init of 0x5f89d20
Calling callout_init of 0x5f89e20
Calling callout_init of 0x5f89c20
Calling callout_stop on 0x5f89d20
Calling callout_init of 0x5f89d20
Calling callout_stop on 0x5f89e20
Calling callout_init of 0x5f89e20
Calling callout_stop on 0x5f89c20
Calling callout_stop on 0x5f89d20
Calling callout_stop on 0x5f89e20
done
Calling callout_init of 0x5f89e20
Calling callout_stop on 0x5f89e20
rebooting

Resetting ...

   I wonder where the 0x187ea98 is coming from in esiop_scsicmd_end().

                          - Chris