Subject: RAID Problems (URGENT)
To: NetBSD Users <netbsd-users@netbsd.org>
From: Uwe Lienig <uwe.lienig@fif.mw.htw-dresden.de>
List: netbsd-users
Date: 08/29/2006 15:38:25
Hello Alpha fellows,

The problem with this RAID started when two components of the array (NetBSD
RAIDframe software RAID) failed within a short time of each other.

OS-specific info:
NetBSD 1.6.2

Hardware:
DEC 3000/300
2x TCDS SCSI adapters

The system is off site and I don't have direct access, but I can phone the site
and have commands issued on the console.

Excerpt from the boot messages:
/netbsd: DEC 3000 - M300, 150MHz, s/n
/netbsd: 8192 byte page size, 1 processor.
/netbsd: total memory = 256 MB
/netbsd: tcds0 at tc0 slot 4 offset 0x0: TurboChannel Dual SCSI (baseboard)
/netbsd: asc0 at tcds0 chip 0: NCR53C94, 25MHz, SCSI ID 7
/netbsd: scsibus0 at asc0: 8 targets, 8 luns per target
/netbsd: tcds2 at tc0 slot 1 offset 0x0: TurboChannel Dual SCSI
/netbsd: tcds2: fast mode set for chip 0
/netbsd: asc3 at tcds2 chip 0: NCR53C96, 40MHz, SCSI ID 7
/netbsd: scsibus3 at asc3: 8 targets, 8 luns per target
/netbsd: tcds2: fast mode set for chip 1
/netbsd: asc4 at tcds2 chip 1: NCR53C96, 40MHz, SCSI ID 7
/netbsd: scsibus4 at asc4: 8 targets, 8 luns per target
/netbsd: tcds1 at tc0 slot 0 offset 0x0: TurboChannel Dual SCSI
/netbsd: tcds1: fast mode set for chip 0
/netbsd: asc1 at tcds1 chip 0: NCR53C96, 40MHz, SCSI ID 7
/netbsd: scsibus1 at asc1: 8 targets, 8 luns per target
/netbsd: tcds1: fast mode set for chip 1
/netbsd: asc2 at tcds1 chip 1: NCR53C96, 40MHz, SCSI ID 7
/netbsd: scsibus2 at asc2: 8 targets, 8 luns per target
/netbsd: scsibus0: waiting 2 seconds for devices to settle...
/netbsd: sd10 at scsibus3 target 0 lun 0: <IBM, DDYS-T18350N, S96H> SCSI3 0/direct fixed
/netbsd: sd10: 17501 MB, 15110 cyl, 6 head, 395 sec, 512 bytes/sect x 35843670 sectors
/netbsd: sd10: sync (100.0ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
:
( followed by similar lines for sd11, sd12, sd13, sd30, sd31, sd32, sd33, which
   are hard-wired to these SCSI devices in the kernel config )

There is a separate disk (sd0) for the OS, a 4 GB Barracuda. The data is stored
on a RAID 5 set. The raid consists of six identical IBM DDYS-T18350 drives
(sd1[0-2], sd3[0-2]), plus a hot spare configured into the raid (sd13b) and a
cold spare (sd33), eight disks in total.
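
(For orientation: RAID 5 over six components spends one component's worth of
space on parity, so the usable capacity is roughly (6 - 1) x 17.5 GB = 87.5 GB,
taking the 17501 MB reported above as 17.5 GB.)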

Everything had worked fine for two years. But over the weekend of 26/27 Aug 2006
two disks (sd30 and sd31) failed.

Prior to the failure the raid config was as follows.
Original config (raid0):

START array
1 6 1

START disks
/dev/sd10b
/dev/sd11b
/dev/sd12b
/dev/sd30b
/dev/sd31b
/dev/sd32b

START spare
/dev/sd13b
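
The layout and queue sections of the file are omitted above. From memory they
follow the usual raidctl(8) RAID 5 example, roughly like this (the stripe-unit
and queue-length values are illustrative, not necessarily the exact ones):

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
32 1 1 5

START queue
fifo 100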

After the raid was initially created two years ago, autoconfiguration was switched on:

raidctl -A yes /dev/raid0

Step 1
--------------------------------

When sd30 failed (about 12 hours before sd31) and then sd31 failed as well, the
system went down. After that the raid couldn't be configured any more; raid0 was missing.

Because two disks had failed, the raid couldn't come up again; raidctl -c failed
with an incorrect modification counter.

First I tried to get the raid going again by forcing a reconfiguration (as I
read raidctl(8), -C configures the set even when the component labels no longer agree):

raidctl -C /etc/raid0.conf raid0

After that the raid came up and /dev/raid0 was accessible. I hoped that the read
errors on sd31 would not recur, and tried to fail sd30:

raidctl -F /dev/sd30b raid0

This caused a panic since sd31 produced hard errors again.
_____________________________________________________________

Step 2
------------------------------------

To get the raid going again, I decided to copy sd31 to sd33 (the cold spare).
This would allow the raid to come up, since sd33 would not produce hard errors.
For the copy I used (all disks are identical):

dd if=/dev/rsd31c bs=1b conv=noerror,sync of=/dev/rsd33c

I know that some blocks will contain wrong data (with conv=noerror,sync, dd
writes null-filled blocks where it hits read errors). sd30 remains failed, but
the copy on sd33 will not produce read errors any more, so configuring the raid
should succeed.
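
If the affected block ranges matter later, the read errors dd reports on stderr
could have been kept in a file, e.g. (the log path is just an example):

dd if=/dev/rsd31c bs=1b conv=noerror,sync of=/dev/rsd33c 2> /var/tmp/dd-sd31.log

The error messages in the log would then show which sectors ended up as
null-filled blocks on sd33.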

Then I edited /etc/raid0.conf, changing sd31 to sd33, so the disks section now looks like this:

START disks
/dev/sd10b
/dev/sd11b
/dev/sd12b
/dev/sd30b
# changed sd31 to sd33
/dev/sd33b
/dev/sd32b

I didn't change the spare line.

After a reboot the raid came up and was configured automagically. Since all the
filesystems that live on the raid were commented out in /etc/fstab, the raid
remained untouched after configuration.
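
In /etc/fstab the raid entries are simply commented out, roughly like this (the
mount points here are placeholders, not the real ones):

#/dev/raid0a  /export/data  ffs  rw  1 2
#/dev/raid0e  /export/home  ffs  rw  1 2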

raidctl -s /dev/raid0

showed

            /dev/sd10b: optimal
            /dev/sd11b: optimal
            /dev/sd12b: optimal
            /dev/sd30b: failed
            /dev/sd31b: optimal
            /dev/sd32b: optimal
            spares: no spares
and
            Parity status: dirty
            Reconstruction is 100% complete.
            Parity Re-write is 100% complete.
            Copyback is 100% complete.


A few questions: why was sd31 not replaced by sd33? Why is there no spare? Where
has sd13 gone? (raidctl -F /dev/sd30b raid0 didn't succeed, due to the immediate
panic in step 1.)
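
I have not looked at the on-disk component labels yet; as I read raidctl(8),
they could be inspected with something like:

raidctl -g /dev/sd33b raid0
raidctl -g /dev/sd13b raid0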
_____________________________________________________________

Step 3
-------------------------------------------------------------

I was sure that sd13 wasn't used, so I added sd13 again:

raidctl -a /dev/sd13b /dev/raid0

Then I initiated a reconstruction again:
raidctl -F /dev/sd13b /dev/raid0

The system panicked again.
_____________________________________________________________

Step 4
-------------------------------------------------------------
After a reboot the system configured the raid again. Now I have:

raidctl -s /dev/raid0

/dev/sd10b: optimal
/dev/sd11b: optimal
/dev/sd12b: optimal
/dev/sd13b: optimal
/dev/sd31b: failed
/dev/sd32b: optimal
spares: no spares

Where is sd33, and why is sd31 marked failed? sd31 was replaced by sd33 in the
config file and should be optimal.

Now I tried to check the file systems on the raid, although the raid is not
fully functional:

fsck -n -f /dev/raid0{a,b,d,e,f}

Some file systems have more errors, some fewer. Basically things look normal
from the file system point of view, but I don't know what state the raid is really in.
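
(The brace notation above is shorthand; in plain sh the checks amount to a loop
like this:)

for p in a b d e f; do
        echo "=== raid0$p ==="
        fsck -n -f /dev/raid0$p
done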

I'm stuck at this point as to what to do next. I really would like to get the
raid going again. I've already ordered new drives, but I'd like to bring the
raid back into a state that allows correct operation again without rebuilding
everything from scratch. Yes, I have a backup on tape (although not the newest
one, since the last backup on Friday the 25th, just before this crash, didn't
make it), so the backup is from two weeks ago.

I see this as a test case for dealing with these kinds of errors on raid sets.

Since this is a file server, I have to get the system up again as quickly as possible.

Thank you all for your input.

-- 


Uwe Lienig
----------
fon: (+49 351) 462 2780
fax: (+49 351) 462 3476
mailto:uwe.lienig@fif.mw.htw-dresden.de

Forschungsinstitut Fahrzeugtechnik
<http://www.fif.mw.htw-dresden.de>
parcels: Gutzkowstr. 22, 01069 Dresden
letters: PF 12 07 01,    01008 Dresden

Hochschule für Technik und Wirtschaft Dresden (FH)
Friedrich-List-Platz 1, 01069 Dresden