Subject: kernel panic with 1.2-release some time after auto-changing tapes
To: None <port-i386@NetBSD.ORG>
From: Laine Stump <laine@MorningStar.Com>
List: port-i386
Date: 12/21/1996 15:46:04
We're having a problem with kernel panics on a system that has a 2940UW
controller and an Archive Python 28849 autochanger DAT drive. The panics
happen some time after several successful tape changes and at least one
failed change. Has anyone else seen anything similar? Any suggestions
on what we should do? (Yes, we've already made sure the SCSI bus is
properly terminated.)

Also - I tried turning off debugging in my kernel config (so the system
would just collect a coredump and reboot on panics), but the kernel
refused to build properly. Exactly what do I need to turn off in my
config to disable the kernel debugger (ddb) and still get a kernel that
builds?

Here's what the probe of the SCSI controller and all the SCSI devices
looks like:

   ahc0 at pci0 dev 18 function 0
   ahc0: interrupting at irq 11
   ahc0: Reading SEEPROM...done.
   ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
   ahc0: Reseting Channel A
   ahc0: Downloading Sequencer Program...Done
   scsibus0 at ahc0
   ahc0: target 0 using 16Bit transfers
   ahc0: target 0 synchronous at 10.0MHz, offset = 0x8
   sd0 at scsibus0 targ 0 lun 0: <SEAGATE, ST15150W, 0023> SCSI2 0/direct fixed
   sd0: 4095MB, 3712 cyl, 21 head, 107 sec, 512 bytes/sec
   ahc0:A:4: refuses WIDE negotiation.  Using 8bit transfers
   ahc0: target 4 synchronous at 5.0MHz, offset = 0xf
   probe(ahc0:4:0): Target Busy
   probe(ahc0:4:0): Target Busy
   st0 at scsibus0 targ 4 lun 0: <ARCHIVE, Python 28849-XXX, 4.CM> SCSI2 1/sequential removable
   st0: st0(ahc0:4:0): Target Busy
   st0(ahc0:4:0): Target Busy
   st0(ahc0:4:0): Target Busy
   drive empty
   probe(ahc0:4:1): Target Busy
   probe(ahc0:4:1): Target Busy
   ch0 at scsibus0 targ 4 lun 1: <ARCHIVE, Python 28849-XXX, 4.CM> SCSI2 8/changer removable
   ch0: 12 slots, 1 drive, 1 picker

And here's what the panic looks like (note this was copied from the
console by hand, and by somebody other than me - I can't vouch for the
accuracy...). This happened some time (several hours, it seems) after
someone ran a script that attempted to auto-change tapes while running
tar (I believe this script failed, but I'd have to check to be sure):

   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:0): Target Busy
   ch0(ahc0:4:0): Target Busy
   ch0(ahc0:4:0): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): Target Busy
   ch0(ahc0:4:1): illegal request, data - 00 00 00 00 00 00 00 00 00 00
   panic: ahc0: Timed-out command times out again

   Stopped at	_Debugger+0x4:	leave
   db> trace
   _Debugger(f8125298,f8105e6e,fb543d30,f89df000,fb543d70) at _Debugger+0x4
   _panic(f8105e6e,f8997014,f89df000,f810600c,80000000) at _panic+0x3a
   _ahc_init(f89df000) at _ahc_init+0x1510
   _softclock(f899cc40,f8b54500,54,1,fb544db0) at _softclock+0x64
   _hardclock(fb543dbc) at _clockintr+0xb
   _Xrecurse0() at _Xrecurse0+0x63
   --- interrupt ---
   idle(0,0,64,100,fb543f40) at _idle+0xd
   bpendtsleep(f8229054,118,f8126bd3,64,f82184b0) at bpendtsleep
   _sys_select(f8b54500,fb543f88,f5b43f80,0,4768) at _sys_select+0x2a9
   _syscall() at _syscall+0x260
   --- syscall (number 93) ---
   0x1004b4ef:
   db>