Subject: Re: Hung TCP connections through wi0 and NAT?
To: None <dyoung@pobox.com>
From: Paul Ripke <stixpjr@ozemail.com.au>
List: tech-net
Date: 01/06/2003 16:30:19
On Monday, January 6, 2003, at 08:26 AM, David Young wrote:

> You know, this could be wi(4) losing. Weren't you having panics in
> wi_read_bap? Please send me your whole dmesg, and ifconfig wi0, for
> my collection.

Yup, the panics are PR kern/19605, in wi_read_bap while under "other"
interrupt load. It's one of possibly three distinct panics I'm looking at.

Dmesg:
NetBSD 1.6K (STIX-PC) #29: Mon Dec 30 22:02:40 EST 2002
     stix@stix-pc.stix.org.au:/usr/src/sys/arch/i386/compile/STIX-PC
total memory = 127 MB
avail memory = 114 MB
using 1658 buffers containing 6632 KB of memory
BIOS32 rev. 0 found at 0xfb140
mainbus0 (root)
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel Pentium III (686-class), 548.57 MHz, id 0x673
cpu0: features 387f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features 387f9ff<PGE,MCA,CMOV,PAT,PSE36,PN,MMX>
cpu0: features 387f9ff<FXSR,SSE>
cpu0: I-cache 16 KB 32b/line 4-way, D-cache 16 KB 32b/line 4-way
cpu0: L2 cache 512 KB 32b/line 4-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: serial number 0000-0673-0001-46F5-B1F0-B431
cpu0: 32 page colors
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82443BX Host Bridge/Controller (rev. 0x03)
agp0 at pchb0: aperture at 0xe0000000, size 0x4000000
ppb0 at pci0 dev 1 function 0: Intel 82443BX AGP Interface (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga0 at pci1 dev 0 function 0: 3Dfx Interactive Banshee (rev. 0x03)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
pcib0 at pci0 dev 7 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <ST38410A>
wd0: drive supports 32-sector PIO transfers, LBA addressing
wd0: 8223 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 16841664 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd1 at pciide0 channel 0 drive 1: <ST317221A>
wd1: drive supports 32-sector PIO transfers, LBA addressing
wd1: 16446 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 33683328 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
atapibus0 at pciide0 channel 1: 2 targets
cd0 at atapibus0 drive 1: <CREATIVE CD5233E, MT1198 B Firmware, C1.00> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
wd2 at pciide0 channel 1 drive 0: <QUANTUM FIREBALLlct15 30>
wd2: drive supports 16-sector PIO transfers, LBA addressing
wd2: 28629 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 58633344 sectors
wd2: 32-bit data port
wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: secondary channel interrupting at irq 15
wd2(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
cd0(pciide0:1:1): using PIO mode 4, DMA mode 2 (using DMA data transfers)
uhci0 at pci0 dev 7 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: interrupting at irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 7 function 3 not configured
wi0 at pci0 dev 8 function 0: Intersil Prism2.5 Wireless Lan
wi0: interrupting at irq 11
wi0: 802.11 address 00:05:5d:5b:c5:f5
wi0: using RF:PRISM2.5 MAC:ISL3874A(Mini-PCI)
wi0: Intersil Firmware: Primary (1.0.5), Station (1.3.4)
wi0: supported rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
siop0 at pci0 dev 9 function 0: Symbios Logic 53c810a (fast scsi)
siop0: interrupting at irq 9
scsibus0 at siop0: 8 targets, 8 luns per target
tlp0 at pci0 dev 10 function 0: DECchip 21140A Ethernet, pass 2.2
tlp0: interrupting at irq 9
tlp0: Ethernet address 00:80:c8:27:f1:b6
lxtphy0 at tlp0 phy 0: LXT970 10/100 media interface, rev. 1
lxtphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ohci0 at pci0 dev 11 function 0: NEC USB Host Controller (rev. 0x41)
ohci0: interrupting at irq 10
ohci0: OHCI version 1.0
usb1 at ohci0: USB revision 1.0
uhub1 at usb1
uhub1: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
ohci1 at pci0 dev 11 function 1: NEC USB Host Controller (rev. 0x41)
ohci1: interrupting at irq 11
ohci1: OHCI version 1.0
usb2 at ohci1: USB revision 1.0
uhub2 at usb2
uhub2: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
NEC USB Host Controller (USB serial bus, interface 0x20, revision 0x02) at pci0 dev 11 function 2 not configured
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: read port 0x203
sb0 at isapnp0 port 0x220/16,0x330/2,0x388/4 irq 5 drq 1,3
sb0: Creative ViBRA16X PnP Audio: dsp v4.16
audio0 at sb0: half duplex, mmap, independent
mpu0 at sb0
midi1 at mpu0: SB MPU-401 MIDI UART
opl0 at sb0: model OPL3
midi2 at opl0: SB Yamaha OPL3
joy0 at isapnp0 port 0x201/1
joy0: Creative ViBRA16X PnP Game
joy0: joystick not connected
apm0 at mainbus0: Power Management spec V1.2
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
scsibus0: waiting 2 seconds for devices to settle...
ulpt0 at uhub1 port 1 configuration 1 interface 0
ulpt0: Canon BJC-2100SP, rev 1.00/1.02, addr 2, iclass 7/1
ulpt0: using bi-directional mode
cd1 at scsibus0 target 0 lun 0: <RICOH, CD-R/RW MP7060S, 1.70> cdrom removable
cd1: async, 8-bit transfers
st0 at scsibus0 target 1 lun 0: <SONY, SDT-5000, 3.26> tape removable
st0: drive empty
st0: async, 8-bit transfers
sd0 at scsibus0 target 2 lun 0: <SyQuest, SQ5200C, 3CE4> disk removable
sd0: drive offline
sd0: sync (200.0ns offset 8), 8-bit (5.000MB/s) transfers
st1 at scsibus0 target 5 lun 0: <QUANTUM, DLT7000, 1624> tape removable
st1: density code 25, variable blocks, write-enabled
st1: sync (100.0ns offset 8), 8-bit (10.000MB/s) transfers
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
IP Filter: v3.4.29 initialized.  Default = pass all, Logging = enabled
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
wsdisplay0: screen 5 added (80x25, vt100 emulation)
wsdisplay0: screen 6 added (80x25, vt100 emulation)
wsdisplay0: screen 7 added (80x25, vt100 emulation)

ifconfig wi0:
wi0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         nwid xxxx
         powersave off
         address: 00:05:5d:5b:c5:f5
         media: IEEE802.11 autoselect hostap (DS2 hostap)
         status: active
         inet 192.168.254.129 netmask 0xffffff80 broadcast 192.168.254.255
         atalk 200.93 range 200-299 phase 2 broadcast 200.93

> Apply this patch, which protects against bogus frame lengths.

With the patch applied, I got one instance of the debug message while
reading a SCSI tape. The system didn't crash, which is an improvement:
     Jan  6 15:09:07 stix-pc /netbsd: wi_rx_intr: oversized packet
I'm still wondering whether two interrupt routines are tripping each
other up...
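
For anyone following along, here is a rough sketch of the kind of guard
that "protects against bogus frame lengths". This is not the actual patch
(it isn't reproduced in this message) and not the wi(4) driver code; the
names rx_len_ok and RX_MAX_PAYLOAD are made up for illustration, and the
2304-byte limit is simply the 802.11 maximum MSDU size, assumed here as a
plausible ceiling. The idea is to check the length the card reports
before using it to size a copy:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical ceiling: the 802.11 maximum MSDU size in bytes. */
#define RX_MAX_PAYLOAD 2304

/*
 * Illustrative check only. Returns 1 if the length reported by the
 * hardware is safe to copy into a buffer of buf_size bytes, and 0
 * (after logging) if the frame should be dropped instead.
 */
static int
rx_len_ok(uint16_t reported_len, size_t buf_size)
{
	if (reported_len > RX_MAX_PAYLOAD || reported_len > buf_size) {
		fprintf(stderr, "rx: oversized packet (%u bytes)\n",
		    (unsigned)reported_len);
		return 0;
	}
	return 1;
}
```

A receive path would call something like rx_len_ok(len, sizeof(buf))
right after reading the frame header and drop the frame on a 0 return,
rather than trusting the length and overrunning the buffer.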

ITOH Yasufumi just posted a similar panic to tech-kern, but to me, it
looks different enough to be a different problem:
  <http://news.gw.com/netbsd.tech.kern/22771>

Cheers,
--
Paul Ripke