Subject: Re: HotSwapping SCSI Disks with RAIDFrame
To: Christoph Kaegi <kgc@zhwin.ch>
From: Greg A. Woods <woods@weird.com>
List: netbsd-users
Date: 07/09/2003 03:31:04
[ On Wednesday, July 9, 2003 at 08:34:59 (+0200), Christoph Kaegi wrote: ]
> Subject: HotSwapping SCSI Disks with RAIDFrame
>
> When putting it back in, the kernel said:
> -------------------------------------- 8< --------------------------------------
> Jul  9 07:34:39 mx2 /netbsd: eset channel A
> Jul  9 07:34:40 mx2 /netbsd: ahc0: Someone reset channel A
> Jul  9 07:34:40 mx2 last message repeated 271 times
> -------------------------------------- 8< --------------------------------------
> 
> Is it normal that the scsibus is reset on insertion of a new
> disk?

It is "normal" if your disks are not really "hot" swappable.

Many RAID vendors talk about "hot" swap as if you can just move disks
around with wild abandon.  However, unless you have some specific
hardware and/or software support that will protect the bus then you
really shouldn't be moving things around while the system is running.

Kingston, now StorCase, make true hot-swap bays that have SCSI isolator
boards on the back.  When you turn the key on the front the isolator
waits for a pause in the traffic on the bus then it uses an electronic
switch to disconnect the bay from the bus.  At that point it changes the
LED display on the front of the bay to indicate that the drive may be
safely removed (and optionally unlocks the solenoid lock if the bay has
such a lock option installed).  Similarly when a drive is inserted and
the key locked again the bus isn't attached to it until there's a gap in
the bus traffic.  The use of an electronic switch also eliminates all
possibility of noise caused by the physical insertion or extraction of
the connectors.  They're not cheap, but they really do work (though I
think at least three of my ten have failed by now, and they are quite a
few years old).

As I understand it most other RAID systems that claim to have hot-swap
support really mean that they have a software function which will allow
the operator to hold the device bus quiescent while a drive is connected
or disconnected.  (The CMD CRD-5000 based RAID systems I have use a
separate SCSI host adapter and bus for each disk I currently have
installed in them so they can be hot-swapped at leisure, assuming they
are not active components of any RAID set.  :-)

I.e. make sure you drop your system into the kernel debugger before you
start pulling disks from, or adding disks to, your SCSI bus!  ;-)
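
On the RAIDframe side, the matching precaution is to make sure the disk
is no longer an active component before it is touched.  A minimal
sketch, assuming raidctl(8)'s -f (fail), -s (status) and -R (rebuild in
place) options; the names raid0 and /dev/sd1e are hypothetical, the
replacement disk is assumed to already carry a matching disklabel, and
DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch only: take a component out of a RAIDframe set before pulling
# the disk, then rebuild onto the replacement in place.
# RAID and COMP are hypothetical names; adjust for the real system.
RAID=${RAID:-raid0}
COMP=${COMP:-/dev/sd1e}

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Before pulling the disk: mark the component failed, then show status.
retire_component() {
    run raidctl -f "$COMP" "$RAID" || return 1
    run raidctl -s "$RAID"
}

# After the new disk is attached and labelled: reconstruct in place.
restore_component() {
    run raidctl -R "$COMP" "$RAID"
}
```

This does nothing for the electrical side of the problem; it only keeps
RAIDframe from issuing I/O to the disk being swapped.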

Electrically I believe it's safer to pull a disk assuming it either has
an SCA-II connector, or is in a bay which has the equivalent features
(power and ground make-first, break-last).  Inserting a disk means
adding a tap and a stub to the data transmission lines and that can
cause problems, especially with the faster Ultra-160 and Ultra-320
speeds.  I did some experimentation with an ASUS server with hot-swap
SCA-II bays & Ultra-320 Cheetah drives and I was never able to insert a
drive when there was any activity whatsoever on the bus without causing
some kind of error.

> The new disk didn't seem to be detected automatically (I couldn't
> disklabel it) so I tried to 
> 
>   scsictl scsibus0 scan any any
> 
> to get the system to recognise it. Then it said:
> -------------------------------------- 8< --------------------------------------
> Jul  9 07:57:30 mx2 /netbsd: panic: ahc_action: not tagged and device busy
> Jul  9 07:57:30 mx2 /netbsd: syncing disks... ahc0: WARNING no command for scb 29 (cmdcmplt)
> Jul  9 07:57:30 mx2 /netbsd: QOUTPOS = 198
> -------------------------------------- 8< --------------------------------------
> 
> ... and ciao. The system just rebooted.

That probably should have worked, but unless you're running with the new
AHC driver in -current it's quite possible the earlier reset noise got
the driver into a bad state and it was bound to crash anyway.

Maybe you should have tried a manual reset of the bus first....
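
Something along these lines is what I mean: a minimal sketch, assuming
scsictl(8)'s bus-level "reset" command and the same scsibus0 as in your
scan.  The five second settle time is a guess rather than a documented
requirement, and DRY_RUN=1 only prints the commands:

```shell
#!/bin/sh
# Sketch only: reset the SCSI bus, give it time to settle, then rescan.
# BUS and SETTLE are assumptions; adjust for the real system.
BUS=${BUS:-scsibus0}
SETTLE=${SETTLE:-5}

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reset first; only rescan if the reset itself succeeded.
rescan_bus() {
    run scsictl "$BUS" reset || return 1
    run sleep "$SETTLE"
    run scsictl "$BUS" scan any any
}
```

Whether this would have avoided the panic with the older driver is
another matter, since the driver may already have been confused by then.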

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>