port-i386: Re: wd0d

Subject: Re: wd0d
To: None <seebs@plethora.net>
From: Sean Doran <smd@ebone.net>
List: port-i386
Date: 10/13/1999 03:29:06
| I know.  If we're going to invent even *more* dedicated use partitions, we
| really need more than 8.  Ideally, I wanted to have two or three OS's on
| the disk, and use "extra" partitions for sharing.

Invent?  Well, I am spindle-rich and would like to spread
tasks among spindles (and controllers) because I can make
various tasks fly better that way.   That and all the other
reasons for lots of partitions (compartmentalizing, protecting
against software failure, different numbers of bytes per inode,
different block/fragment sizes, and so on) apply.

Right now I have, on three disks, the following partitions:

	/, /usr, /var, /usr/pkg, /usr/local, /usr/X11R6, 
	/u, /usr/pkgsrc, /usr/src, /usr/obj.i386,
	/usr/safeplace (== $DESTDIR), /usr/xsrc, 
	and /usr/pkgsrc/distfiles

I just lost a disk (IBM DGHS09U, developed bad blocks, then
crashed crashed crashed into oblivion:

probe(isp0:0:0):  Check Condition on CDB: 0x00 00 00 00 00 00
    SENSE KEY:  Hardware Error
     ASC/ASCQ:  Internal Target Failure
     FRU CODE:  0x1
sd0 at scsibus0 targ 0 lun 0: <IBM, DGHS09U, 0350> SCSI3 0/direct fixed
sd0: drive offline


) and had to give up three CODA partitions which were fun to
play with, and a nice big MP3 crunching area.

RAID is almost attractive; I would carve up each disk into
chunks and have several RAID 5 devices spread across all 
the disks.   That would have helped me recover from losing
my /var and /usr/X11R6 partitions among other things...

My fear with RAID is that a software bug takes out the
whole RAID 5 partition.  This happened this week to a hardware
RAID device; CPU went berserk and ate lots of data in
a perfectly ordinary logical way - RAID 5 was no help there,
but multiple partitions probably would have been.
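
To make that concrete, here is a toy sketch in Python (not how
raidframe or any real controller is written; the chunk contents
and the parity() helper are invented for illustration).  XOR parity
rebuilds a chunk lost with a dead spindle, but a wrong write that
arrives "legitimately" gets folded into parity like any other:

```python
from functools import reduce

def parity(chunks):
    """XOR equal-length byte chunks together, as RAID 5 does for one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

# One stripe: three data chunks plus their parity (contents are made up).
data = [b"var!", b"X11R", b"home"]
p = parity(data)

# Losing one chunk (a dead spindle) is recoverable from the survivors.
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]

# A logically valid but wrong write updates parity as well, so the
# "rebuild" faithfully returns the garbage, not the original data.
data[1] = b"oops"
p = parity(data)
assert parity([data[0], data[2], p]) == b"oops"
```

Separate filesystems at least limit how far that kind of damage
spreads.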

My fear is based on some early experiences with raidframe,
before lots of bugs got ironed out.   That meant waiting patiently
for many, many minutes after a crash for the raid device to become
consistent before doing an fsck (I crash a lot, sigh; see also
the complaints about "syncing disks... 27 10 5 2 2 2 2 2...
giving up" -> dirty partitions).

	Sean.

P.S.: can one use /dev/[ws]d[0-9]c for real work on a disk which is
      totally dedicated to NetBSD?   Do you die a horrible death due
      to foreseeable interaction with software that treats sd0c as
      "whole NetBSD partition" or "whole disk"?