Subject: Re: anyone know if there's a fix for this "malloc with held simple_lock" in RAIDframe bug yet?
To: Greg Oster <oster@cs.usask.ca>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 03/16/2005 01:50:41
[ On Tuesday, March 15, 2005 at 15:46:56 (-0600), Greg Oster wrote: ]
> Subject: Re: anyone know if there's a fix for this "malloc with held simple_lock" in RAIDframe bug yet? 
>
> If you build a new 'raidctl' (actually... you might not need one, but 
> whatever) then you can use the word 'absent' as a "disk does not 
> exist" place-holder.

Yeah!  (The old raidctl worked fine too -- it contains no relevant
changes -- though in the end I backported it as well, mostly for the
manual page fixups.)
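
For anyone else following along, the only interesting part of the
config is using 'absent' where the not-yet-available disk goes.
Something along these lines -- a sketch matching the geometry shown in
the component labels below, not my verbatim file:

    START array
    # numRow numCol numSpare
    1 2 0

    START disks
    /dev/sd1a
    absent

    START layout
    # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
    128 1 1 1

    START queue
    fifo 100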


OK, anyway: so far, so good, and everything seems to be working
properly now.

I've constructed the new RAID-1 device, populated it with copies of the
installed filesystems, booted successfully from it, and I am now
reconstructing to the original install disk.  The only weird thing is
the negative time estimate!  ;-)
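
(Roughly what that amounted to, from memory -- device names as in the
output below, and the exact invocations may have differed a bit:

    raidctl -C /etc/raid0.conf raid0   # force-configure with the 'absent' member
    raidctl -I 1412893 raid0           # stamp the component labels
    raidctl -i raid0                   # parity init (stays DIRTY while one half is absent)
    disklabel -e raid0                 # partition the new device
    newfs /dev/rraid0a                 # ... and likewise the other partitions
    mount /dev/raid0a /mnt
    dump -0f - / | (cd /mnt && restore -rf -)   # etc. for each filesystem
    raidctl -A root raid0              # make it auto-configurable as root

plus boot blocks and /etc/fstab tweaks, of course.)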


[console]<@> # raidctl -v -s raid0          
Components:
           /dev/sd1a: optimal
          component1: failed
Spares:
           /dev/sd0a: spare
Component label for /dev/sd1a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 1412893, Mod Counter: 106
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 71131904
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
component1 status is: failed.  Skipping label.
/dev/sd0a status is: spare.  Skipping label.
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

[console]<@> # raidctl -v -F component1 raid0
Reconstruction sraid0: vnode was NULL
RECON: initiating reconstruction on col 1 -> spare at col 2
tatus:
  0% |                                       | ETA:    -42:-536 |
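
(That's the kernel's printfs getting interleaved with raidctl's
"Reconstruction status:" line there, by the way.  As for the -42:-536,
my guess is that at 0% there's no transfer rate accumulated yet, so
whatever arithmetic turns sectors-remaining into minutes:seconds
wanders off negative -- just a guess though, I haven't gone digging in
that bit of the code.)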



And in any case "systat vm" is much more interesting to watch:

Disks:   fd0   cd0   sd0   sd1   sd2
 seeks                              
 xfers               965   964      
 bytes               60M   60M      
 %busy              99.9  78.1      


I really like those numbers!  ;-)

(now why can't the filesystem move data that fast?)

hmmm... it's slowing down, so it must be near the end -- and it's done:

raid0: Reconstruction of disk at col 1 completed
raid0: Recon time was 618.804143 seconds, accumulated XOR time was 0 us (0.000000)
raid0:  (start time 1110953957 sec 955365 usec, end time 1110954576 sec 759508 usec)
raid0: Total head-sep stall count was 0
raid0: 734085 recon event waits, 1 recon delays
raid0: 1110953957863819 max exec ticks
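
(618.8 seconds for 71131904 512-byte sectors works out to roughly
36.4 GB, i.e. somewhere around 58-59 MB/s sustained, which agrees
nicely with what systat was showing.)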


(perhaps some of those numbers are a little odd too -- that "max exec
ticks" figure looks suspiciously like the start time in microseconds....)


[console]<@> # raidctl -v -s raid0
Components:
           /dev/sd1a: optimal
          component1: spared
Spares:
           /dev/sd0a: used_spare
Component label for /dev/sd1a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 1412893, Mod Counter: 108
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 71131904
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
component1 status is: spared.  Skipping label.
Component label for /dev/sd0a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 1412893, Mod Counter: 108
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 71131904
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Parity status: clean
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

[console]<@> # df
Filesystem  1M-blocks     Used    Avail %Cap Mounted on
/dev/raid0a      1968      619     1250  33% /
/dev/raid0d      9844     5461     3889  58% /usr/pkg
/dev/raid0e     18440       43    17475   0% /var
mfs:88            969        0      920   0% /tmp
/dev/sd7a      712380        0   705256   0% /home
/dev/sd6a      716634      842   708625   0% /var/log
/dev/sd5a     1040029     1382  1028246   0% /var/spool/imap


And after one final reboot from sd0, adding the real spare, etc., all is well:

[ttyp0]<woods@newpub> # raidctl -v -s raid0           
Components:
           /dev/sd1a: optimal
           /dev/sd0a: optimal
Spares:
           /dev/sd2a: spare
Component label for /dev/sd1a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 1412893, Mod Counter: 114
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 71131904
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Component label for /dev/sd0a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 1412893, Mod Counter: 114
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 71131904
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
/dev/sd2a status is: spare.  Skipping label.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
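
(For the record, the spare went back in with something along the lines
of:

    raidctl -a /dev/sd2a raid0

and as far as I can tell hot spares added that way don't persist across
a reboot of an autoconfigured set, so it'll want re-adding -- or a line
in rc.local -- after each boot.)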


> > Once I get to the point of booting from the mirrored root then I'll send
> > you my diffs 
> 
> Ok.  I suspect the diffs will be quite large -- a lot of stuff has 
> changed.  Might be good to keep them around in case other folks are 
> interested in them, but I'm not sure I'd want to request a pullup of 
> that size for 1.6.x :-}  (The releng folks would probably shoot me :) )

Ah, no, I meant the diffs to -current that are necessary to do the
backport.....  (that's the only way I'll be able to maintain this change
in my own trees)

They're quite small, less than 800 lines as a unidiff, even including
some added debugging messages and a wee readme that reminds me how to
update my source trees.

As for whether it's worth doing the backport officially or not -- well,
it does seem essential for anyone wanting to use RAIDframe on any 1.6.x
SMP platform.  :-)

-- 
						Greg A. Woods

H:+1 416 218-0098  W:+1 416 489-5852 x122  VE3TCP  RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>