Subject: Re: ahc bug in current? (was: ccd changed in current?)
To: Michael L. VanLoon -- HeadCandy.com <michaelv@mindbender.serv.net>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: current-users
Date: 08/28/1997 18:39:32
On Thu, 28 Aug 1997 18:29:59 -0700 
 "Michael L. VanLoon -- HeadCandy.com" <michaelv@MindBender.serv.net> wrote:

 > The ccd driver seems to be exonerated for now... :-)

*Whew*  :-)

 > Both running parallel fsck's, and parallel dd's locked up the system
 > tight.  I didn't even have ccd configured during the tests, so it's
 > not a factor.

You are just the person I'm looking for :-)

Here's the scoop: I have seen the same problem on an important production
system, but I can't debug it there because I can't have it crashing all the
time, so that system is running an older kernel.

I can't reproduce this problem on my i486 EISA system that has an ahc
in it.

I have two theories:

	(a) The system gets stuck due to an unbalanced splbio() call.

	(b) The ahc driver sets up some sort of PCI DMA parameter that
	    loses with 486 PCI chipsets.  (Note, I have not heard any
	    reports of this problem on Pentiums or greater, and have
	    not been able to reproduce the problem on an AlphaServer
	    8200 with an ahc connected to several RAID boxes doing
	    performance measurements).

Given that the only two reported incidents happened on 486 PCI systems, I
am inclined to go with (b).
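
As a minimal illustration of theory (a), here is a sketch of what an
unbalanced splbio() looks like. It assumes a simplified userland model of
the spl primitives; cur_ipl, start_io_buggy() and start_io_fixed() are
hypothetical names made up for this example and are not ahc driver code.
The point is only that an early return which skips splx() leaves block I/O
interrupts masked, which from the outside would look like a hard lockup.

```c
#include <assert.h>

/* Hypothetical userland model of interrupt priority levels. */
enum { IPL_NONE = 0, IPL_BIO = 1 };

static int cur_ipl = IPL_NONE;

/* Raise to IPL_BIO (never lower) and return the previous level. */
int
splbio(void)
{
	int old = cur_ipl;

	if (cur_ipl < IPL_BIO)
		cur_ipl = IPL_BIO;
	return old;
}

/* Restore a level previously returned by splbio(). */
void
splx(int s)
{
	assert(s >= IPL_NONE && s <= IPL_BIO);
	cur_ipl = s;
}

/* Unbalanced: the error path returns with the level still raised. */
int
start_io_buggy(int fail)
{
	int s = splbio();

	if (fail)
		return -1;	/* BUG: missing splx(s) */
	splx(s);
	return 0;
}

/* Balanced: every path out of the function goes through splx(). */
int
start_io_fixed(int fail)
{
	int s = splbio();
	int error = fail ? -1 : 0;

	splx(s);
	return error;
}
```

In this model, calling start_io_fixed(1) leaves cur_ipl at IPL_NONE, while
calling start_io_buggy(1) leaves it stuck at IPL_BIO. In a real kernel the
equivalent state would keep disk interrupts blocked indefinitely.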

Justin, were there any changes related to this?

 > Back to the hardware.  A 2940UW and a dual-channel 3940UW, with five
 > SCSI hard drives, plus a jaz drive, spread across them (I used all six
 > drives during the parallel-dd test).  There is also a SCSI CD-ROM and
 > a SCSI DAT drive, which aren't really involved.
 > 
 > It should be noted that because there are so many controllers, plus an
 > SMC EtherPower PCI (DEC 21140 chip), that there are at least two PCI
 > interrupts being shared.  It should also be noted that this doesn't
 > cause any problems under NetBSD-1.2.
 > 
 > Any more ideas?  Bug in the ahc driver?  Bug with sharing PCI-PCI
 > interrupts, and/or PCI bridges (there are two of them in the system)?
 > 
 > To give more info on the installed hardware, here's the dmesg output
 > from booting my 1.2+ kernel:
 > 
 > NetBSD 1.2 (MINDBENDER) #417: Tue Mar 11 21:47:49 PST 1997
 >     michaelv@MindBender.serv.net:/u/src/sys/arch/i386/compile/MINDBENDER
 > CPU: i486DX (486-class CPU)
 > real mem  = 66715648
 > avail mem = 53395456
 > using 2430 buffers containing 9953280 bytes of memory
 > mainbus0 (root)
 > isa0 at mainbus0
 > com0 at isa0 port 0x3e8-0x3ef irq 9: ESP, 1024 byte fifo
 > com1 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 > com2 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
 > lpt0 at isa0 port 0x378-0x37f: polled
 > lpt1 at isa0 port 0x278-0x27f: polled
 > npx0 at isa0 port 0xf0-0xff: using exception 16
 > vt0 at isa0 port 0x60-0x6f irq 1: unknown s3, 80 col, color, 9 scr, mf2-kbd, [R3.32]
 > fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
 > fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
 > pci0 at mainbus0 bus 0: configuration mode 1
 > vendor Intel, unknown product 0x122d (class bridge, subclass host, revision 0x02) at pci0 dev 0 function 0 not configured
 > vendor Intel, unknown product 0x122e (class bridge, subclass ISA, revision 0x02) at pci0 dev 7 function 0 not configured
 > de0 at pci0 dev 9 function 0: DC21140 [10-100Mb/s] pass 1.2
 > de0: Ethernet address 00:00:c0:87:87:e5
 > de0: enabling 10baseT UTP port
 > de0: interrupting at irq 15
 > ahc0 at pci0 dev 10 function 0
 > ahc0: interrupting at irq 12
 > ahc0: Reading SEEPROM...done.
 > ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
 > ahc0: Reseting Channel A
 > ahc0: Downloading Sequencer Program...Done
 > scsibus0 at ahc0
 > ahc0: target 0 synchronous at 10.0MHz, offset = 0xf
 > ahc0: target 0 Tagged Queuing Device
 > sd0 at scsibus0 targ 0 lun 0: <COMPAQPC, DPES-30540, S31K> SCSI2 0/direct fixed
 > sd0: 511MB, 4901 cyl, 2 head, 106 sec, 512 bytes/sec
 > ahc0: target 2 synchronous at 4.0MHz, offset = 0xf
 > cd0 at scsibus0 targ 2 lun 0: <SONY, CD-ROM CDU-55S, 1.0t> SCSI2 5/cdrom removable
 > ahc0: target 3 synchronous at 10.0MHz, offset = 0xf
 > ahc0: target 3 Tagged Queuing Device
 > sd1 at scsibus0 targ 3 lun 0: <iomega, jaz 1GB, J.83> SCSI2 0/direct removable
 > sd1: sd1(ahc0:3:0): illegal request, data = 03 9a 01 37 d6 00 00 00 20 20 00 00 00 00 00 00 00
 > sd1: could not mode sense (4); using fictitious geometry
 > 1021MB, 1021 cyl, 64 head, 32 sec, 512 bytes/sec
 > ahc0: target 4 synchronous at 5.0MHz, offset = 0xf
 > st0 at scsibus0 targ 4 lun 0: <ARCHIVE, IBM4326NP/RP  !D, 4.AC> SCSI2 1/sequential removable
 > st0: drive empty
 > ahc0: target 5 synchronous at 10.0MHz, offset = 0xf
 > ahc0: target 5 Tagged Queuing Device
 > sd2 at scsibus0 targ 5 lun 0: <HP, C3323-300, 5011> SCSI2 0/direct fixed
 > sd2: 1003MB, 2982 cyl, 7 head, 98 sec, 512 bytes/sec
 > ppb0 at pci0 dev 11 function 0: Digital Equipment DECchip 21050 PCI-PCI Bridge (rev. 0x02)
 > pci1 at ppb0 bus 1
 > ahc1 at pci1 dev 4 function 0
 > ahc1: interrupting at irq 10
 > ahc1: Reading SEEPROM...done.
 > ahc1: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
 > ahc1: Reseting Channel A
 > ahc1: Downloading Sequencer Program...Done
 > scsibus1 at ahc1
 > ahc1: target 6 synchronous at 10.0MHz, offset = 0xf
 > ahc1: target 6 Tagged Queuing Device
 > sd3 at scsibus1 targ 6 lun 0: <SEAGATE, ST31200N, 8648> SCSI2 0/direct fixed
 > sd3: 1006MB, 2700 cyl, 9 head, 84 sec, 512 bytes/sec
 > ahc2 at pci1 dev 5 function 0
 > ahc2: interrupting at irq 12
 > ahc2: Reading SEEPROM...done.
 > ahc2: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
 > ahc2: Reseting Channel A
 > ahc2: Downloading Sequencer Program...Done
 > scsibus2 at ahc2
 > ahc2: target 5 synchronous at 10.0MHz, offset = 0xf
 > ahc2: target 5 Tagged Queuing Device
 > sd4 at scsibus2 targ 5 lun 0: <HP, C3323-300, 5011> SCSI2 0/direct fixed
 > sd4: 1003MB, 2982 cyl, 7 head, 98 sec, 512 bytes/sec
 > ahc2: target 6 synchronous at 10.0MHz, offset = 0xf
 > ahc2: target 6 Tagged Queuing Device
 > sd5 at scsibus2 targ 6 lun 0: <SEAGATE, ST31200N, 8630> SCSI2 0/direct fixed
 > sd5: 1006MB, 2700 cyl, 9 head, 84 sec, 512 bytes/sec
 > S3 968 (class display, subclass VGA, revision 0x00) at pci0 dev 12 function 0 not configured
 > biomask 1440 netmask 9440 ttymask 965a
 > sd1(ahc0:3:0): illegal request, data = 0c 29 01 37 d6 00 00 00 ff ff 00 00 00 00 00 00 00
 > sd1: could not mode sense (4); using fictitious geometry
 > 
 >  Aperture driver for XFree86 version 1.5
 > 
 > 
 > -----------------------------------------------------------------------------
 >   Michael L. VanLoon                           michaelv@MindBender.serv.net
 >       Contract software development for Windows NT, Windows 95 and Unix.
 >              Windows NT and Unix server development in C++ and C.
 > 
 >         --<  Free your mind and your machine -- NetBSD free un*x  >--
 >     NetBSD working ports: 386+PC, Mac 68k, Amiga, Atari 68k, HP300, Sun3,
 >         Sun4/4c/4m, DEC MIPS, DEC Alpha, PC532, VAX, MVME68k, arm32...
 >     NetBSD ports in progress: PICA, others...
 > -----------------------------------------------------------------------------

Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                            Home: +1 408 866 1912
NAS: M/S 258-6                                       Work: +1 415 604 0935
Moffett Field, CA 94035                             Pager: +1 415 428 6939