Subject: Re: RAIDframe questions
To: Holger Weiss <holger@weiss.in-berlin.de>
From: Chris Ross <cross+netbsd@distal.com>
List: port-sparc64
Date: 09/21/2007 23:35:46
On Sep 19, 2007, at 22:43, Holger Weiss wrote:
> * Chris Ross <cross+netbsd@distal.com> [2007-09-19 22:01]:
>>  I'll figure out how to cause a crash dump on a sparc64 via serial
>>  console.  :-)
>
> send a BREAK in order to drop into DDB and then use 'sync'.
>
> The main reason for my mail is that I'm interested in whether the dump
> works for you and if so, how long it took :-)  I set up RAIDframe on
> sparc64 a few weeks ago.  Works just fine for me, but I gave up on the
> crash dump after it dumped for about 15 hours (while printing out '4096'
> continuously).  My Ultra 80 has 4 GB of memory, but it shouldn't take
> _that_ long.  Maybe it failed on syncing the buffer cache, I haven't
> tried 'reboot 0x104'[*] yet.  If that doesn't help I'll ask the list,
> but in any case I'd be interested in whether the crash dump works for
> you :-)

   Hi.  Now that I've reconfigured my crash-dump partitions to be large
enough, I find that I have the same problem.  I have effectively the
same machine you do (Ultra-80 / Ultra E420r) with 4 GB of memory.

   An attempt to force a crash dump caused:

[halt sent]
cpu0: kdb breakpoint at 128f8e0
Stopped at      netbsd:cpu_Debugger+0x4:        nop
db> sync
syncing disks... panic: switch wchan
cpu0: kdb breakpoint at 128f8e0
Stopped at      netbsd:cpu_Debugger+0x4:        nop
db> sync
Frame pointer is at 0x1089a441
Call traceback:
12859b0(ffffffffffffffff, d, 0, 1446000, 1089afb0, 0, 1089a511) fp = 1089a511
10a0d70(100, 0, 144f0f4, 144f000, 144f0f0, 144f0cc, 1089a5d1) fp = 1089a5d1
10a1240(1089b068, 0, 1, 1089af48, 73, 73000000, 1089a691) fp = 1089a691
10a16b4(140b800, 13098a0, 0, 0, 1089b158, b, 1089a7e1) fp = 1089a7e1
10a4cdc(128f8e8, 0, 15ad9c00, 13f293800, 1089bba800, 0, 1089a8d1) fp = 1089a8d1
1290c60(0, 0, 0, 0, ffff, 1000000, 1089a9a1) fp = 1089a9a1
128e758(101, 1089b920, 0, a, 1089bad8, 148a800, 1089af01) fp = 1089af01
1008b34(1089b920, 101, 128f8e0, 1d0006, 0, 1, 1089b071) fp = 1089b071
11dea4c(13f2938, 1089bba8, 10897ad8, 0, 1282700, a, 1089b251) fp = 1089b251
1009fa4(14055aa, 104, 2, 0, 140b800, 1483b30, 1089b321) fp = 1089b321
11cb188(fc99540, 0, 0, 1, 1, 4, 1089b3d1) fp = 1089b3d1
11cb678(fc99540, 0, 1, 0, 2, 2, 1089b4a1) fp = 1089b4a1
10b6b14(0, 10, 13b7b98, 0, 6d6b108, 1483a90, 1089b561) fp = 1089b561
100a22c(6d6b000, 0, 5db1000, 400, 400, 4, 1089b621) fp = 1089b621
0(0, 0, 0, 0, 0, 0, 0) fp = 0

dumping to dev 7,1 offset 4191191
dump 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096  
4096 4096 [.....]

    So, there's the issue of the first "panic: switch wchan" that I
mentioned earlier, and also the fact that a sparc64 running 4.0_RC1
appears unable to complete a crash dump, at least with 4 GB of
memory.  Does anyone [on port-sparc64] know why this is?  Is this a
known bug?  It's certainly something I'd like to see fixed!

   Thanks....

                                - Chris