Subject: wm0 receive overrun after adding lsi fibre channel card.
To: None <port-i386@netbsd.org>
From: Jonathan Kay <jpk@panix.com>
List: port-i386
Date: 02/24/2005 10:29:54
Hello all,

  I have a lovely server that I've been running various stable versions
of -current on for a few years now--it's a Dell PowerEdge 600SC with the
ServerWorks SL chipset, onboard Intel gigabit & a 3ware RAID card.  It has
served as a fairly heavy-duty file server & for various other
not-too-taxing stuff.

  I've not had any problems with it until we needed a lot more space &
got an Xserve RAID to attach--since then I've been running into lots of
problems.  I got an Apple fibre channel PCI-X card (LSI 929X), which worked
beautifully with the Xserve RAID in another machine with an Intel 850
chipset (I think) and a 3Com 10/100 card.

  With the Xserve RAID attached to the 600SC there were several issues.
First off, the non-ACPI kernel I had didn't work--it crashed when I was
dumping from the 3ware RAID to the Xserve RAID.  Finally, with an ACPI
kernel, everything started to look good, but now about two to four times
a day I get

wm0: Received overrun

and the machine stops responding to pings for about 10 seconds--it can
still ping itself, but can't ping out.  Its link light on the switch stays
on.

  These might be related to lots of traffic coming through the ethernet,
but I'm not 100% sure about this.  (Sometimes it seems to happen when
there is very little traffic going on.)

  There are also some oddities with samba traffic that I haven't seen
before.

  I'm not sure what is going on.  Any input is definitely welcome.  The
kernel is a patched version from Jason Thorpe that lets it recognise the
full size of the RAID array.  The same kernel was really stable in the
machine w/ the Intel 850 chipset.
  Any suggestions for really nice gigabit cards?  I've always been
pretty happy with Intel's "wm"; "gsip"s never really worked super well
for me.

Thank you for any input!
Jonathan

here are some snippets from dmesg:

NetBSD 2.99.10 (CLUB-UNIX.MP) #4: Sat Feb 19 14:03:55 EST 2005
jpk@club-unix.clubhouse.local:/usr/src/20041124/src/sys/arch/i386/compile/CLUB-UNIX.MP
..
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, version 11, 16 pins
ioapic0: misconfigured as apic 0
ioapic0: remapped to apic 2
ioapic1 at mainbus0 apid 3 (I/O APIC)
ioapic1: pa 0xfec01000, version 11, 16 pins
ioapic1: misconfigured as apic 0
ioapic1: remapped to apic 3
ioapic2 at mainbus0 apid 4 (I/O APIC)
ioapic2: pa 0xfec02000, version 11, 16 pins
ioapic2: misconfigured as apic 0
ioapic2: remapped to apic 4
acpi0 at mainbus0
acpi0: using Intel ACPI CA subsystem version 20040211
acpi0: X/RSDT: OemId <DELL  ,PE600SC ,00000001>, AslId <MSFT,0100000a>
acpi0: SCI interrupting at int 9
...
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: ServerWorks CMIC-SL PCI/AGP bridge (rev. 0x32)
pchb1 at pci0 dev 0 function 1
pchb1: ServerWorks CMIC-SL PCI/AGP bridge (rev. 0x00)
wm0 at pci0 dev 2 function 0: Intel i82540EM 1000BASE-T Ethernet, rev. 2
wm0: interrupting at ioapic1 pin 1 (irq 10)
wm0: 32-bit 33MHz PCI bus
wm0: 64 word (6 address bits) MicroWire EEPROM
wm0: Ethernet address 00:c0:9f:22:a9:51
makphy0 at wm0 phy 1: Marvell 88E1011 Gigabit PHY, rev. 3
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ohci0 at pci0 dev 3 function 0: NEC USB Host Controller (rev. 0x43)
ohci0: interrupting at ioapic1 pin 2 (irq 5)
...
ehci0 at pci0 dev 3 function 2: NEC USB Host Controller (rev. 0x04)
ehci0: interrupting at ioapic1 pin 2 (irq 5)
...
mpt0 at pci0 dev 5 function 0: LSI Logic FC929X FC Adapter
mpt0: interrupting at ioapic1 pin 7 (irq 5)
mpt0: Port 0: Link state Failed
mpt0: External Bus Reset
mpt0: Port 0: FC Link Event: LIP(f8,f7) (Loop Initialization)
mpt0:   Device detected loop failure before acquiring AL_PA
mpt0: Port 0: Link state Active
mpt0: Rescan Port 0
scsibus0 at mpt0: 256 targets, 8 luns per target
mpt1 at pci0 dev 5 function 1: LSI Logic FC929X FC Adapter
mpt1: interrupting at ioapic1 pin 8 (irq 10)
scsibus1 at mpt1: 256 targets, 8 luns per target
twe0 at pci0 dev 7 function 0: 3ware Escalade
twe0: interrupting at ioapic1 pin 11 (irq 5)
twe0: 4 ports, Firmware FE7X 1.05.00.036, BIOS BE7X 1.08.00.044
twe0: Monitor ME7X 1.01.00.035, PCB Rev3    , Achip V3.20   , Pchip V1.30   
..
rccide0 at pci0 dev 14 function 0
rccide0: ServerWorks CSB6 RAID/IDE Controller (rev. 0xa0)
rccide0: bus-master DMA support present
rccide0: primary channel configured to native-PCI mode
rccide0: using ioapic0 pin 11 (irq 11) for native-PCI interrupt
atabus0 at rccide0 channel 0
rccide0: secondary channel wired to native-PCI mode
atabus1 at rccide0 channel 1
pchb2 at pci0 dev 15 function 0
pchb2: ServerWorks CSB6 southbridge (rev. 0xa0)
...
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <SAMSUNG CD-ROM  SC-148C, , B105> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(rccide0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33) (using DMA)
sd0 at scsibus0 target 0 lun 0: <APPLE, Xserve RAID, 1.24> disk fixed
sd0: 1117 GB, 143080 cyl, 128 head, 128 sec, 512 bytes/sect x 2344222720 sectors
sd1 at scsibus0 target 0 lun 1: <APPLE, Xserve RAID, 1.24> disk fixed
sd1: 1117 GB, 143080 cyl, 128 head, 128 sec, 512 bytes/sect x 2344222720 sectors
boot device: ld0
...