Subject: problem with RAIDframe setup (my mistake) setting a root mirror
To: None <netbsd-users@NetBSD.org>
From: Carl Brewer <carl@bl.echidna.id.au>
List: netbsd-users
Date: 05/08/2006 09:05:17
Hello,

I think I've made a mistake with disklabel while trying to set up a
root mirror on a pair of 160GB IDE HDDs on an i386 box.
This is a fresh install of NetBSD/i386 3.0.

When I run the reconstruction to bring the second disk online:

{24} raidctl -S raid0
Reconstruction is 0% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
Reconstruction status:
  85% |**********************************     | ETA:    12:16 /

and then this appears in /var/log/messages:

May  8 13:45:54  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
May  8 13:45:54  /netbsd: wd0: (id not found)
May  8 13:45:54  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
May  8 13:45:54  /netbsd: wd0: (id not found)
May  8 13:45:55  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
May  8 13:45:55  /netbsd: wd0: (id not found)
May  8 13:45:56  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
May  8 13:45:56  /netbsd: wd0: (id not found)
May  8 13:45:56  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15), retrying
May  8 13:45:56  /netbsd: wd0: (id not found)
May  8 13:45:57  /netbsd: wd0a: error reading fsbn 268435392 of 268435392-268435519 (wd0 bn 268435455; cn 266305 tn 0 sn 15)wd0: (id not found)
May  8 13:45:57  /netbsd:
May  8 13:45:57  /netbsd: raid0: Recon read failed!
May  8 13:45:57  /netbsd: raid0: reconstruction failed.

I think I may have miscalculated my swap offset?
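
A quick sanity check on the numbers in that log, as a rough sketch
only. The sector values are copied from the log above and from the
disklabels below; the 64-sector RAIDframe reserve is the same "64"
used in the offset arithmetic further down and is treated here as an
assumption. The variable names are made up for illustration.

```python
# All values are in 512-byte sectors.
FAILING_FSBN = 268435392   # "fsbn" in the log: sector relative to the start of wd0a
FAILING_BN = 268435455     # "wd0 bn" in the log: absolute sector on wd0
WD_A_OFFSET = 63           # offset of wd0a in the wd0 disklabel
RAID_RESERVED = 64         # assumption: sectors RAIDframe reserves in each component
RAID0A_SIZE = 308386399    # size of raid0a in the raid0 disklabel

# The two block numbers in the log should differ by exactly the wd0a offset.
consistent = FAILING_FSBN + WD_A_OFFSET == FAILING_BN

# Translate the failing sector into a sector inside raid0.
raid_sector = FAILING_FSBN - RAID_RESERVED

# Does that sector fall inside raid0a (root) rather than raid0b (swap)?
in_root = raid_sector < RAID0A_SIZE

# Is the failing absolute sector the last one below 2**28?
at_2_28_boundary = FAILING_BN == 2**28 - 1

print(consistent, raid_sector, in_root, at_2_28_boundary)
```

Under those assumptions the failing read lands inside the raid0a area
rather than in the swap partition, and the absolute block number is
exactly 2^28 - 1. Whether either fact is related to the failure is not
established here.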

The disklabels for all three devices are:

wd0 :

16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
  a: 312581745        63       RAID                     # (Cyl.      0*- 310100)
  b:   4195233 308386526       swap                     # (Cyl. 305939*- 310100*)
  c: 312581745        63     unused      0     0        # (Cyl.      0*- 310100)
  d: 312581808         0     unused      0     0        # (Cyl.      0 - 310100)
disklabel: partitions a and b overlap


wd1 :

16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
  a: 312581745        63       RAID                     # (Cyl.      0*- 310100)
  b:   4195233 308386526       swap                     # (Cyl. 305939*- 310100*)
  c: 312581745        63     unused      0     0        # (Cyl.      0*- 310100)
  d: 312581808         0     unused      0     0        # (Cyl.      0 - 310100)
disklabel: partitions a and b overlap


raid0 :

4 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
  a: 308386399         0     4.2BSD   2048 16384 28832  # (Cyl.      0 - 301158*)
  b:   4195233 308386399       swap                     # (Cyl. 301158*- 305255*)
  d: 312581632         0     unused      0     0        # (Cyl.      0 - 305255*)



I worked out the swap offset for wd0 & wd1 as follows:

The wd0b/wd1b offset is the raid0b offset (which equals the raid0a
size) + 127 (63 for the wd0a offset + 64 sectors reserved by
RAIDframe), so that's 308386399 + 127 = 308386526
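
The arithmetic above can be re-checked with a short script. This is
only a sketch: the numbers are copied from the disklabels in this
message, and the function and variable names are made up for
illustration.

```python
# All values are in 512-byte sectors, copied from the disklabels above.
WD_A_OFFSET = 63            # offset of wd0a / wd1a on the physical disk
WD_A_SIZE = 312581745       # size of wd0a / wd1a
RAID_RESERVED = 64          # the "64" above: sectors reserved by RAIDframe
RAID0B_OFFSET = 308386399   # offset of raid0b inside raid0 (equals raid0a size)
RAID0B_SIZE = 4195233       # size of raid0b (swap)


def physical_offset(raid_offset, component_offset, reserved):
    """Translate an offset inside raid0 into an offset on the underlying disk."""
    return raid_offset + component_offset + reserved


wd_b_offset = physical_offset(RAID0B_OFFSET, WD_A_OFFSET, RAID_RESERVED)
wd_b_end = wd_b_offset + RAID0B_SIZE   # first sector past the swap partition
wd_a_end = WD_A_OFFSET + WD_A_SIZE     # first sector past the RAID partition

print("wd0b offset:", wd_b_offset)
print("swap fits inside wd0a:", wd_b_end <= wd_a_end)
```

With these inputs the computed offset matches the 308386526 entered in
the wd0 and wd1 disklabels, and the swap partition ends before the end
of the RAID partition.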

I put that into the disklabel for wd0 & wd1. disklabel then warns that
partitions a and b overlap, but I expect that's normal, since b does
sit inside a.  Should the guide perhaps be updated to confirm that
disklabel will warn you about the overlap, but that it's expected?

My raid0.conf file looks like this:

START array
1 2 0

START disks
absent
/dev/wd0a

START layout
128 1 1 1

START queue
fifo 100


I followed the instructions from the guide (URL below), except that I
started with my root disk as wd1, not wd0.  This was because I got the
same error when doing it the other way, and I wanted to see if it was
maybe a physical disk problem with wd1 that wasn't showing up during
formatting etc.

http://www.netbsd.org/guide/en/chap-rf.html#chap-rf-ex-raid1root

More info on the setup :
: {44} df -h
Filesystem    Size      Used     Avail Capacity  Mounted on
/dev/raid0a   145G     552M      137G     0%    /
kernfs        1.0K     1.0K        0B   100%    /kern
: {45} swapctl -l
Device      512-blocks     Used    Avail Capacity  Priority
/dev/raid0b    4195233        0  4195233     0%    0

Can anyone suggest what I may have done wrong? It was late at
night when I did this and I have probably missed something
important.  I can reinstall the box completely if I need to,
it's not got anything on it, and it's only time that I've
wasted :)

Thanks!

Carl