NetBSD-Users archive
Re: RaidFrame Raid-1 problem (can't ditch a failing disk)
On Fri, 26 Feb 2010 01:12:12 -0500
Louis Guillaume <louis%zabrico.com@localhost> wrote:
> Hi!
>
> I have a strange problem replacing a drive from a RAID-1 RaidFrame set.
> Here's some info:
>
> # uname -mrs
> NetBSD 5.0_STABLE i386
>
> # raidctl -s raid0
> Components:
> /dev/sd0a: failed
> /dev/sd1a: optimal
> No spares.
> /dev/sd0a status is: failed. Skipping label.
> Component label for /dev/sd1a:
> Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
> Version: 2, Serial Number: 20071216, Mod Counter: 280
> Clean: No, Status: 0
> sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
> Queue size: 100, blocksize: 512, numBlocks: 143638784
> RAID Level: 1
> Autoconfig: Yes
> Root partition: Yes
> Last configured as: raid0
> Parity status: DIRTY
> Reconstruction is 100% complete.
> Parity Re-write is 100% complete.
> Copyback is 100% complete.
>
> # dmesg | grep sd0
> sd0 at scsibus0 target 0 lun 0: <ModusLnk, , > disk fixed
> sd0: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
> sd0: sync (12.50ns offset 62), 16-bit (160.000MB/s) transfers, tagged queueing
> raid0: Components: /dev/sd0a[**FAILED**] /dev/sd1a
>
> # grep smartd.*sd0d /var/log/messages |tail -3
> Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, opened
> Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, is SMART capable. Adding to "monitor" list.
> Feb 26 00:43:04 thoth smartd[296]: Device: /dev/sd0d, SMART Failure: HARDWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS
>
>
>
> So we got a bad disk and I have to change it out. So I did the following:
>
> o failed the component with "raidctl -f /dev/sd0a raid0"
> o shut down
> o replaced the disk
> o rebooted
> o Now the system panics right after raidframe initializes. Sorry
> I don't have the exact messages but it's all raidframe stuff.
> Maybe I'll have to take a photo or something. "reboot 0x104"
> didn't seem to work.
> o power off
> o replace the "bad" sd0
> o machine boots as normal
>
> So what gives? I verified that I'm removing the correct disk. No
> question; the hardware agrees, the LSI Logic bios display agrees and the
> scsibus/devices all agree that I'm removing the correct drive.
>
> I also tried removing the drive and not replacing it with a new one.
> Still no luck there.
>
> Any help would be great!
Since what you're doing seems to be correct, I think we're going to need
a photo or backtrace or whatever of the panic in order to figure out
what's gone wrong :(
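
For the "am I pulling the right disk" check, a minimal sketch of filtering
the "raidctl -s" output quoted above for components marked failed. The
function name failed_components is made up for illustration, and it assumes
the component lines look like the ones in your output ("/dev/sd0a: failed"):

```shell
#!/bin/sh
# Hypothetical helper (not part of raidctl): read "raidctl -s" output on
# stdin and print the device path of every component reported as failed.
# Assumes component lines have exactly two fields: "<device>:" and a status.
failed_components() {
    awk 'NF == 2 && $2 == "failed" && $1 ~ /^\/dev\/.*:$/ {
        sub(/:$/, "", $1)   # drop the trailing colon from the device name
        print $1
    }'
}
```

It would be used as "raidctl -s raid0 | failed_components". The longer
"/dev/sd0a status is: failed. Skipping label." line is skipped on purpose
because it has more than two fields.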
Later...
Greg Oster