Subject: RAIDframe crash
To: None <current-users@netbsd.org>
From: Kazushi (Jam) Marukawa <jam@pobox.com>
List: current-users
Date: 09/15/2001 03:32:54
Hi,

This time I made a backup of my RAID5 and tried to
reproduce the RAIDframe crash that I reported in my earlier
mail.  I was able to reproduce it on a clean file system, so
I guess RAIDframe has a problem in its unconfigure or
configure code.  Please check it if possible.  Thanks.

I used a kernel from Aug 17 or 18, so this problem may
already be fixed.  Indeed, I experienced a strange lockup
with that kernel which is already solved in today's
kernel.


What I did was basically to unconfigure an existing raid
partition and reconfigure it from scratch.  The actual
commands I executed were:

 $ raidctl -u raid0
 $ raidctl -C raid0.conf raid0
 $ raidctl -I xxxxxx raid0
 $ raidctl -i raid0
 $ disklabel -w raid0 labelXXX
 $ newfs raid0
 $ mount /dev/raid0e /mnt
 $ cd usr
 $ tar cf - . | (cd /mnt; tar xfp -)
 uvm_fault(0xc06015a0, 0x0, 0, 1) -> e
 kernel: page fault trap, code=0
 Stopped in pid 245 (raid) at    VOP_STRATEGY+0x1f:      movl          0x3c(%eax),%eax
 db> trace
 VOP_STRATEGY(c0c8a15c,c0c6a8e4,c0bc0700,3fff10b9,6280) at VOP_STRATEGY+0x1f
 rf_DispatchKernelIO(c0c6a8e4,c0bc0700,c0c65750,c0ba3980,c0bc2000) at rf_DispatchKernelIO+0x1ca
 rf_DiskIOEnqueue(c0c6a8e4,c0bc0700,1,c0bc2000,0) at rf_DiskIOEnqueue+0x1cd
 rf_DiskReadFuncForThreads(c0bc2000,c0bc2000,d3e4dd10,c01d039c,c0bc2000) at rf_DiskReadFuncForThreads+0x13c
 FireNode(c0bc2000) at FireNode+0x4a
 FireNodeList(c0bc2360,c0ca96c0,0,0,1) at FireNodeList+0x158
 PropagateResults(c0ca96c0,0,c0ca96c0,d3e4dd7c,c01d09a0) at PropagateResults+0x324
 ProcessNode(c0ca96c0,0,d3e4dd8c,c01c2a00,c0ca96c0) at ProcessNode+0xbd
 rf_FinishNode(c0ca96c0,0,d3e4dd9c,c01d0072,c0ca96c0) at rf_FinishNode+0x18
 rf_NullNodeFunc(c0ca96c0,c0bbc710,d3e4ddb4,c01d0230,c0ca96c0) at rf_NullNodeFunc+0x14
 FireNode(c0ca96c0) at FireNode+0x4a
 FireNodeArray(1,c0bbc710,0,c0c7a900,3fff10b9) at FireNodeArray+0x158
 rf_DispatchDAG(c0bbc700,c01f2c00,c0c7a900) at rf_DispatchDAG+0xf1
 rf_State_ExecuteDAG(c0bb9600,0,c0bb34e8,0,0) at rf_State_ExecuteDAG+0x14f
 rf_ContinueRaidAccess(c0bb9600) at rf_ContinueRaidAccess+0x9a
 rf_ReleaseStripeLock(c0c6a000,c5,0,c0bb4ce8) at rf_ReleaseStripeLock+0xf6d
 rf_State_Cleanup(c0bb9400,c0bb9400,1,2,0) at rf_State_Cleanup+0x374
 rf_ContinueRaidAccess(c0bb9400,0,c0cc9bd0,3fff10b9,c0bb9448) at rf_ContinueRaidAccess+0xaa
 rf_ContinueDagAccess(c0c7a700,c0b94000,c01d0aa4,0,c0b94184) at rf_ContinueDagAccess+0x168
 DAGExecutionThread(c0b94000) at DAGExecutionThread+0x14c
 db> c
 uvm_fault(0xc06015a0, 0x0, 0, 1) -> e
 kernel: page fault trap, code=0
 Stopped in pid 245 (raid) at    VOP_STRATEGY+0x1f:      movl          0x3c(%eax),%eax
 db> p %eax
 c0234253

If I rebooted the system right after the newfs, everything
worked fine.

-- Kazushi