Subject: Re: PATA disk drive not being configured in 2.0.2
To: None <netbsd-help@netbsd.org, port-alpha@netbsd.org>
From: Dieter <netbsd@sopwith.solgatos.com>
List: port-alpha
Date: 06/11/2005 09:57:29
> Well, I'm out of ideas on this one. It seems to use the correct bus_space
> handle, it even has the same alignment as the one used for the satalink.
>
> Does anyone more familiar with the alpha architecture have an idea what's
> happening here ? What does the machine check below exactly mean ?
...
> This cmdide controller is the one from the motherboard, right ?

Yes.

> If it's an add-on PCI card, it would be possible that not all PCI address
> lines are wired up (this has been seen with some promise controllers), but
> if it's the one from the motherboard I would expect it to be wired
> properly.

Properly?  Well.... there is the "Vector allocation failed" thing.
http://mail-index.NetBSD.org/port-alpha/2005/02/04/0003.html
DEC says it is okay, but to my way of thinking the factory shouldn't
be putting out stuff with error messages.

It was working in at least 1.6.2 and 2.0.  (Other than some unreadable
sectors, which I assume are the drive's fault rather than the controller's.
Maybe that's a bad assumption?  The drive is less than a year old and shouldn't
be having so many bad sectors.)  I did a clean shutdown to add the SATA
controller and disk.  Upon powerup I got a bunch of new "Vector allocation
failed" messages.  Grrr.  Removed a couple of boards I wasn't using and
swapped boards around until I was down to the one factory warning again.

So now kernels without cmdide don't configure the PATA disk.
Kernels with cmdide trap if a PATA drive is attached.
(I wasted a day or two swapping things around playing "let's see if
*this* disk will boot" while figuring this out.  Pulling the
SATA controller out didn't help; that was of course one of the first
things I tried.)

Put the PATA drive in a USB-to-PATA box.  Works okay until it gets a read
error (I assume), then even the simplest command to the disk takes about
25 minutes, and always fails.  Power cycle the USB box and the kernel hangs.
Great fun having to reboot every time you get a read error.  My latest normal
(non-debug) kernel is working much better though.  Best guess is that
taking out cmdide fixed the USB, but I haven't tested that yet.

It appears that some sectors got bad data written to them.
Fsck seems to have made things worse rather than better.  :-(
So now I'm dd-ing partitions over to the SATA drive and working
with a copy, to avoid digging the hole even deeper.
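
For anyone copying a failing partition the same way, here is a rough
sketch of the idea, similar in spirit to dd with conv=noerror,sync: read
block by block, and when a block cannot be read, write zeros in its place
so the offsets in the copy still line up.  This is only an illustration,
not the command used above.  The function name rescue_copy, the 512-byte
block size, and the paths are all made up for the example, and it assumes
the source size can be found by seeking to the end (pass size explicitly
if that does not work for a raw device).

```python
import os


def rescue_copy(src_path, dst_path, block_size=512, size=None):
    """Copy src to dst one block at a time, zero-filling unreadable blocks.

    Returns the list of byte offsets whose read raised an error.
    All names and defaults here are hypothetical, for illustration only.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        if size is None:
            # Assumption: seeking to the end reports the source size.
            size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = os.pread(src, want, offset)
                except OSError:
                    # Unreadable block: remember where it was and move on.
                    data = b""
                    bad_offsets.append(offset)
                # Pad failed or short reads with zeros to keep alignment.
                dst.write(data.ljust(want, b"\0"))
                offset += want
    finally:
        os.close(src)
    return bad_offsets
```

Calling rescue_copy("/path/to/source", "/path/to/copy.img") would return
the offsets that failed, which gives a list of the blocks to be suspicious
of when running fsck on the copy instead of the original.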

So much for my plan to add a SATA drive to mirror the PATA one
and increase reliability.