Subject: kern/17875: H/W checksums on wm have issues
To: gnats-bugs@gnats.netbsd.org
From: dokas@cs.umn.edu
List: netbsd-bugs
Date: 08/07/2002 15:08:57
>Number:         17875
>Category:       kern
>Synopsis:       hardware checksums on wm (Intel GigE NIC) have issues
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 07 13:09:00 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Paul Dokas
>Release:        NetBSD 1.6E
>Organization:
University of Minnesota, Department of Computer Science
>Environment:
System: NetBSD baldrick.cs.umn.edu 1.6E NetBSD 1.6E (BALDRICK) #0: Tue Aug  6 09:53:56 CDT 2002     root@baldrick.cs.umn.edu:/usr/src/sys/arch/i386/compile/BALDRICK i386
Architecture: i386
Machine: i386
>Description:

I'm attempting to turn on hardware IP/TCP/UDP checksumming on my
GigE firewall.  The machine has a couple of Intel GigE NICs with SX
fiber connectors:

  wm0 at pci4 dev 6 function 0: Intel i82543GC 1000BASE-X Ethernet, rev. 2
  wm0: interrupting at irq 5
  wm0: Ethernet address 00:03:47:de:e8:cb
  wm0: 1000baseSX, 1000baseSX-FDX, auto

  wm1 at pci4 dev 8 function 0: Intel i82543GC 1000BASE-X Ethernet, rev. 2
  wm1: interrupting at irq 11
  wm1: Ethernet address 00:03:47:de:e8:45
  wm1: 1000baseSX, 1000baseSX-FDX, auto


The machine has been running fine without any hardware checksumming.  Today,
however, I turned on hardware checksumming on my external NIC:

  wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=7<IP4CSUM,TCP4CSUM,UDP4CSUM>
        enabled=6<IP4CSUM,TCP4CSUM,UDP4CSUM>
        address: 00:03:47:de:e8:cb
        media: Ethernet autoselect (1000baseSX full-duplex)
        status: active
        inet 128.101.80.129 netmask 0xffffffe0 broadcast 128.101.80.159
        inet6 fe80::203:47ff:fede:e8cb%wm0 prefixlen 64 scopeid 0x1


Since then, I've been getting these messages:

  wm0: wm_tx_cksum: need to m_pullup, packet dropped
  wm0: device timeout (txfree 251 txsfree 59 txnext 6)
  wm0: device timeout (txfree 252 txsfree 60 txnext 4)
  wm0: device timeout (txfree 192 txsfree 0 txnext 64)
  wm0: device timeout (txfree 248 txsfree 56 txnext 8)
  wm0: device timeout (txfree 191 txsfree 0 txnext 65)
  wm0: device timeout (txfree 235 txsfree 58 txnext 21)
  wm0: device timeout (txfree 253 txsfree 61 txnext 3)
  wm0: device timeout (txfree 252 txsfree 60 txnext 4)
  wm0: device timeout (txfree 235 txsfree 57 txnext 32)

at a rate of about one message every 60 seconds.  Each time a new
timeout message is logged, the NIC appears to reset itself.
The effect looks the same as running this:

  ifconfig wm0 down
  sleep 10
  ifconfig wm0 up


If I disable IP4CSUM, the timeouts drop to about one every 10 minutes,
usually during a burst of traffic.


The NIC is fine if I turn off all hardware checksumming.
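
For anyone triaging this, here is a minimal sketch (not part of the driver
or of any NetBSD tool) that tallies the timeout lines quoted above.  The
helper name parse_timeouts is made up for illustration, and reading
"txsfree 0" as "no free transmit job slots" is only an assumption drawn
from the field name, not something this report confirms.

```python
import re

# Kernel messages copied verbatim from this report.
LOG = """\
wm0: wm_tx_cksum: need to m_pullup, packet dropped
wm0: device timeout (txfree 251 txsfree 59 txnext 6)
wm0: device timeout (txfree 252 txsfree 60 txnext 4)
wm0: device timeout (txfree 192 txsfree 0 txnext 64)
wm0: device timeout (txfree 248 txsfree 56 txnext 8)
wm0: device timeout (txfree 191 txsfree 0 txnext 65)
wm0: device timeout (txfree 235 txsfree 58 txnext 21)
wm0: device timeout (txfree 253 txsfree 61 txnext 3)
wm0: device timeout (txfree 252 txsfree 60 txnext 4)
wm0: device timeout (txfree 235 txsfree 57 txnext 32)
"""

# Matches only the "device timeout" lines and captures the three counters.
TIMEOUT_RE = re.compile(
    r"^(?P<ifname>\w+): device timeout "
    r"\(txfree (?P<txfree>\d+) txsfree (?P<txsfree>\d+) "
    r"txnext (?P<txnext>\d+)\)$"
)


def parse_timeouts(text):
    """Return one dict per 'device timeout' line found in text."""
    events = []
    for line in text.splitlines():
        m = TIMEOUT_RE.match(line.strip())
        if m is None:
            continue  # skip other messages, e.g. the m_pullup warning
        events.append({
            "ifname": m.group("ifname"),
            "txfree": int(m.group("txfree")),
            "txsfree": int(m.group("txsfree")),
            "txnext": int(m.group("txnext")),
        })
    return events


events = parse_timeouts(LOG)
# Timeouts logged while the txsfree counter read zero.
exhausted = [e for e in events if e["txsfree"] == 0]
print(len(events), "timeouts;", len(exhausted), "with txsfree == 0")
```

Run against the nine timeout lines in this report, it finds two where
txsfree is 0, which may be worth comparing against the remaining seven.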

>How-To-Repeat:

On a machine with an Intel GigE NIC driven by wm, run:

  ifconfig wm0 ip4csum tcp4csum udp4csum

>Fix:

Unknown.  Presumably the hardware checksumming support in the wm
driver is buggy.
>Release-Note:
>Audit-Trail:
>Unformatted: