NetBSD-Users archive


RaidFrame Raid-1 problem (can't ditch a failing disk)


I have a strange problem replacing a drive in a RAID-1 RaidFrame set. Here's some info:

# uname -mrs
NetBSD 5.0_STABLE i386

# raidctl -s raid0
           /dev/sd0a: failed
           /dev/sd1a: optimal
No spares.
/dev/sd0a status is: failed.  Skipping label.
Component label for /dev/sd1a:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 20071216, Mod Counter: 280
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 143638784
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

# dmesg | grep sd0
sd0 at scsibus0 target 0 lun 0: <ModusLnk, , > disk fixed
sd0: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd0: sync (12.50ns offset 62), 16-bit (160.000MB/s) transfers, tagged queueing
raid0: Components: /dev/sd0a[**FAILED**] /dev/sd1a

# grep smartd.*sd0d /var/log/messages |tail -3
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, opened
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, is SMART capable. Adding to "monitor" list.
Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, SMART Failure: HARDWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS

So sd0 is going bad and I have to swap it out. I did the following:

  o failed the component with "raidctl -f /dev/sd0a raid0"
  o shut down
  o replaced the disk
  o rebooted
  o Now the system panics right after raidframe initializes. Sorry,
    I don't have the exact messages, but it's all raidframe stuff.
    Maybe I'll have to take a photo or something. "reboot 0x104"
    didn't seem to work.
  o power off
  o put the original "bad" sd0 back in
  o machine boots as normal

So what gives? I verified that I'm removing the correct disk. No question: the hardware, the LSI Logic BIOS display, and the scsibus/device listing all agree that I'm removing the correct drive.
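
For what it's worth, here is a minimal sketch of one way to double-check which component the set itself considers failed, by filtering the "raidctl -s" output shown above. The function name failed_components is made up for illustration and is not part of raidctl. It assumes the component status lines have the "/dev/sdXa: status" form seen in the output above.

```shell
#!/bin/sh
# failed_components (hypothetical helper): read "raidctl -s" output on
# stdin and print every component whose status is "failed", one per line.
# Only lines of the form "<device>: failed" are matched, so the
# "... status is: failed.  Skipping label." line is ignored.
failed_components() {
    awk '$1 ~ /^\/dev\/.*:$/ && $2 == "failed" {
        sub(/:$/, "", $1)   # strip the trailing colon from the device name
        print $1
    }'
}
```

It would be used as "raidctl -s raid0 | failed_components", and the result compared against the target/lun that the SCSI BIOS and dmesg report for the drive being pulled.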

I also tried removing the drive and not replacing it with a new one. Still no luck there.

Any help would be great!


