Subject: Weird SCSI problems power related?
To: None <port-alpha@netbsd.org>
From: Kevin P. Neal <kpneal@pobox.com>
List: port-alpha
Date: 07/13/2001 23:14:14
Ok, so I have my trusty AS 200 4/233 here and a 166 MHz AXPpci33. On
both boxes I'm using the built-in SCSI (53c810) with no problems.

On the AXPpci33 I also have (siop1) a 53c875 ultra-scsi board. This
board was never able to boot the 200 when I had it in there, but it
does show up in the SRM on the AXPpci33. Whoo.

It's a bit of a weird board. It works just fine in a PC, with the
card's diagnostics at power-up and everything. It has onboard termination
that I have jumpered correctly as far as I can tell. The board is
recognized by Digital UNIX 4.0something sorta, but DU claims it isn't
a supported board. I don't think I had any drives hooked up to it when
I tried it with DU. 

I have two drives external, I'm using a Sun-branded external cable
that looks to be about a foot long, and I have a terminator from
scsipro.com at the end. 

Anyway, the AXPpci33 with said SCSI card is really unstable. If I don't
turn power on to the box with drives and the CPU box at the exact same
time then the drives won't function at all. I get a variety of errors
ranging from the drives not probing at all to probing but causing hangs
of the bus that prevent bootup. Well, I can turn the whole set on at
once, so that's not too terrible. Usually all is well.

What is a problem is that the drives appear to be going south, with
reports of bad blocks and so forth. This wouldn't surprise me since they
cost $0.00. The reason I suspect the problem is elsewhere is that one day
one drive is flaky and the next day the other drive is flaky. One drive
will crash the box, the other drive won't come up on reboot.

Here's the really ugly part: I'm in an old house (the attic was insulated
in 1958) and the Electrical Code pre-1962 sucks. Bad. I've got two-prong
wall outlets with one circuit for all wall outlets in the three bedrooms
total. No ground line anywhere. I'm renting, so I'm not about to pay
to have the house rewired (and the landlord said he'd do it last year,
humpf). I do at least have the whole box+drives plugged into the same
power strip, if that matters at all. 

Is it possible that not having a ground wire would cause this sort of
crappy behavior from these drives?

Currently, today, I can crash 1.5 trivially. The last time I did this,
the box failed to make it to multiuser mode because the OTHER drive
wouldn't respond to anything after it was probed. I say trivial because
all it takes is dd if=/dev/sd1g of=/dev/null (and sd1g is an empty
partition smack in the middle of the drive). The rest of this email
is the panic output followed by my dmesg.

panic: getblk: block size invariant failed
syncing disks... panic: lockmgr: locking against myself

dumping to dev 8,1 offset 331363
dump 
unexpected machine check:

    mces    = 0x1
    vector  = 0x670
    param   = 0xfffffc0000006048
    pc      = 0xfffffc00004e6ea8
    ra      = 0xfffffc0000305ecc
    curproc = 0xfffffc0001e4ec88
        pid = 1438, comm = dd

panic: machine check

dumping to dev 8,1 offset 331363
dump device not ready


rebooting...

Copyright (c) 1996, 1997, 1998, 1999, 2000
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.5 (TESSERACT) #5: Sun Apr 22 21:14:07 EDT 2001
    kpn@tome.neutralgood.org:/usr/src/sys/arch/alpha/compile/TESSERACT
Alpha PC AXPpci33, 166MHz
8192 byte page size, 1 processor.
total memory = 32768 KB
(2024 KB reserved for PROM, 30744 KB used by NetBSD)
avail memory = 25704 KB
using 204 buffers containing 1632 KB of memory
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), LCA-2 (21066 pass 2)
lca0 at mainbus0
pci0 at lca0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
siop0: interrupting at isa irq 11
scsibus0 at siop0: 8 targets, 8 luns per target
sio0 at pci0 dev 7 function 0: vendor 0x8086 product 0x0484 (rev. 0x03)
de0 at pci0 dev 8 function 0
de0: interrupting at isa irq 9
de0: DEC DE500-BA 21143 [10-100Mb/s] pass 3.0
de0: address 08:00:2b:c4:99:0a
siop1 at pci0 dev 12 function 0: Symbios Logic 53c875 (ultra-wide scsi)
siop1: using on-board RAM
siop1: interrupting at isa irq 10
scsibus1 at siop1: 16 targets, 8 luns per target
isa0 at sio0
lc0 at isa0 port 0x300-0x31f iomem 0xd0000-0xd07ff irq 5: DE205-AC
lc0: address 00:00:f8:51:24:12, 128KB RAM, 2KB window
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 channel 0 drive 0: <QUANTUM MAVERICK 540A>
wd0: drive supports 8-sector pio transfers, lba addressing
wd0: 516 MB, 1049 cyl, 16 head, 63 sec, 512 bytes/sect x 1057392 sectors
wd0: drive supports PIO mode 3, DMA mode 1
wd1 at wdc0 channel 0 drive 1: <Maxtor 7540 AV>
wd1: drive supports 8-sector pio transfers, lba addressing
wd1: 514 MB, 1046 cyl, 16 head, 63 sec, 512 bytes/sect x 1054368 sectors
wd1: drive supports PIO mode 3, DMA mode 0
vga0 at isa0 port 0x3b0-0x3df iomem 0xa0000-0xbffff
wsdisplay0 at vga0: console (80x25, vt100 emulation), using wskbd0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
scsibus0: waiting 2 seconds for devices to settle...
de0: enabling 10baseT port
sd0 at scsibus0 target 3 lun 0: <MICROP, 1548-15MZ1077801, HZ2P> SCSI1 0/direct fixed
siop0: target 3 now synchronous at 10.0Mhz, offset 8
sd0: 1637 MB, 2112 cyl, 15 head, 105 sec, 512 bytes/sect x 3353426 sectors
cd0 at scsibus0 target 5 lun 0: <YAMAHA, CRW6416S, 1.0c> SCSI2 5/cdrom removable
cd1 at scsibus0 target 6 lun 0: <DEC, RRD43   (C) DEC, 1084> SCSI2 5/cdrom removable
scsibus1: waiting 2 seconds for devices to settle...
sd1 at scsibus1 target 2 lun 0: <MICROP, 3391WS, x43h> SCSI2 0/direct fixed
siop1: target 2 using 16bit transfers
siop1: target 2 now synchronous at 20.0Mhz, offset 15
sd1: 8681 MB, 4811 cyl, 22 head, 167 sec, 512 bytes/sect x 17780058 sectors
sd2 at scsibus1 target 4 lun 0: <MICROP, 3391WS, x43h> SCSI2 0/direct fixed
siop1: target 4 using 16bit transfers
siop1: target 4 now synchronous at 20.0Mhz, offset 15
sd2: 8681 MB, 4811 cyl, 22 head, 167 sec, 512 bytes/sect x 17780058 sectors
root on sd0a dumps on sd0b
root file system type: ffs
sd2(siop1:4:0): command timeout
siop1: scsi bus reset
cmd 0xfffffe0000040318 (target 4) in reset list
cmd 0xfffffe0000040318 about to be processed
siop1: target 2 using 16bit transfers
siop1: target 2 now synchronous at 20.0Mhz, offset 15
siop1: scsi bus reset
siop0: scsi bus reset
sd2(siop1:4:0): parity error
siop1: scsi bus reset
cmd 0xfffffe0000040738 (target 4) in reset list
cmd 0xfffffe0000040738 about to be processed
siop1: target 4 using 16bit transfers
siop1: target 4 now synchronous at 20.0Mhz, offset 15
siop0: target 3 now synchronous at 10.0Mhz, offset 8
siop1: target 2 using 16bit transfers
siop1: target 2 now synchronous at 20.0Mhz, offset 15
sd2(siop1:4:0): command timeout
siop1: scsi bus reset
cmd 0xfffffe0000041028 (target 4) in reset list
cmd 0xfffffe0000041028 about to be processed
siop1: target 2 using 16bit transfers
siop1: target 2 now synchronous at 20.0Mhz, offset 15
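
For anyone who wants to tally the errors in a log like the one above per
drive and per controller, here is a minimal sketch in Python. It is only
an illustration: the function name tally_scsi_errors and the SAMPLE string
are made up for this example, and the patterns assume nothing beyond the
message formats visible in this dmesg.

```python
import re
from collections import Counter

# Matches drive error lines such as "sd2(siop1:4:0): command timeout".
ERROR_RE = re.compile(r"^(sd\d+)\((siop\d+):(\d+):(\d+)\): (.+)$")
# Matches controller lines such as "siop1: scsi bus reset".
RESET_RE = re.compile(r"^(siop\d+): scsi bus reset$")


def tally_scsi_errors(dmesg_text):
    """Return (errors, resets) counters for a dmesg excerpt.

    errors is keyed by (drive, message); resets is keyed by controller.
    Hypothetical helper, not part of any NetBSD tool.
    """
    errors = Counter()
    resets = Counter()
    for line in dmesg_text.splitlines():
        line = line.strip()
        match = ERROR_RE.match(line)
        if match:
            errors[(match.group(1), match.group(5))] += 1
            continue
        match = RESET_RE.match(line)
        if match:
            resets[match.group(1)] += 1
    return errors, resets


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
sd2(siop1:4:0): command timeout
siop1: scsi bus reset
siop1: target 2 now synchronous at 20.0Mhz, offset 15
siop1: scsi bus reset
siop0: scsi bus reset
sd2(siop1:4:0): parity error
siop1: scsi bus reset
sd2(siop1:4:0): command timeout
siop1: scsi bus reset
"""

if __name__ == "__main__":
    errors, resets = tally_scsi_errors(SAMPLE)
    for (drive, message), count in sorted(errors.items()):
        print(drive, message, count)
    for controller, count in sorted(resets.items()):
        print(controller, "bus resets", count)
```

Running this against logs from several boots would show whether the
errors really move between sd1 and sd2 from day to day.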
-- 
"A method for inducing cats to exercise consists of directing a beam of
invisible light produced by a hand-held laser apparatus onto the floor ...
in the vicinity of the cat, then moving the laser ... in an irregular way
fascinating to cats,..." -- US patent 5443036, "Method of exercising a cat"