Subject: Measuring dropped packets
To: None <tech-net@netbsd.org>
From: Christoph Kaegi <kgc@zhwin.ch>
List: tech-net
Date: 10/26/2006 10:59:14
Hello List

Our NetBSD 3.0 ipf firewall handles several thousand users on a
40 Mbit/s link to the Internet.

We are now experiencing delays on Internet connections, and certain
applications (video conferencing) report packet loss.

How can I find out whether and where packets are dropped on the
firewall (apart from netstat -di)?
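
Since the netstat counters are cumulative, one way to read them is to
take two snapshots and compare per-second deltas. A minimal sketch of
that arithmetic in Python follows; the counter names (ipkts, ierrs,
drops) and all numbers are hypothetical placeholders, not real output
from this firewall, and counter wraparound is not handled.

```python
def counter_rates(before, after, seconds):
    """Return per-second rates for every counter present in both samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        name: (after[name] - before[name]) / seconds
        for name in before
        if name in after
    }


# Hypothetical samples for one interface, taken 10 seconds apart.
t0 = {"ipkts": 1000000, "ierrs": 10, "drops": 5}
t1 = {"ipkts": 1250000, "ierrs": 30, "drops": 45}

rates = counter_rates(t0, t1, 10)
for name, per_second in sorted(rates.items()):
    print(f"{name}: {per_second:.1f}/s")
```

A drop or error rate that rises together with the packet rate would
point at the interface or the interrupt path rather than at the
filter rules, but that is only an assumption to be checked.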

One observation is that we have quite a high interrupt load
(between 10'000 and 20'000 interrupts/second at the moment, with
larger peaks whose exact values I no longer remember).

The NICs are:
---------------------------- 8< ----------------------------
wm0 at pci3 dev 2 function 0: Intel i82546EB 1000BASE-T Ethernet, rev. 1
wm0: interrupting at irq 12
wm0: 64-bit 133MHz PCIX bus
makphy0 at wm0 phy 1: Marvell 88E1011 Gigabit PHY, rev. 3
wm1 at pci3 dev 2 function 1: Intel i82546EB 1000BASE-T Ethernet, rev. 1
wm1: interrupting at irq 12
wm1: 64-bit 133MHz PCIX bus
makphy1 at wm1 phy 1: Marvell 88E1011 Gigabit PHY, rev. 3
wm2 at pci5 dev 1 function 0: Intel i82546GB 1000BASE-T Ethernet, rev. 3
wm2: interrupting at irq 11
wm2: 64-bit 66MHz PCIX bus
makphy2 at wm2 phy 1: Marvell 88E1011 Gigabit PHY, rev. 5
wm3 at pci5 dev 1 function 1: Intel i82546GB 1000BASE-T Ethernet, rev. 3
wm3: interrupting at irq 11
wm3: 64-bit 66MHz PCIX bus
makphy3 at wm3 phy 1: Marvell 88E1011 Gigabit PHY, rev. 5
wm4 at pci5 dev 2 function 0: Intel i82546GB 1000BASE-T Ethernet, rev. 3
wm4: interrupting at irq 11
wm4: 64-bit 66MHz PCIX bus
makphy4 at wm4 phy 1: Marvell 88E1011 Gigabit PHY, rev. 5
wm5 at pci5 dev 2 function 1: Intel i82546GB 1000BASE-T Ethernet, rev. 3
wm5: interrupting at irq 11
wm5: 64-bit 66MHz PCIX bus
makphy5 at wm5 phy 1: Marvell 88E1011 Gigabit PHY, rev. 5
---------------------------- 8< ----------------------------

I also get messages like:
---------------------------- 8< ----------------------------
wm2: Receive overrun
wm0: device timeout (txfree 3706 txsfree 0 txnext 22)
wm0: device timeout (txfree 3750 txsfree 0 txnext 459)
wm0: device timeout (txfree 3852 txsfree 0 txnext 352)
wm2: Receive overrun
---------------------------- 8< ----------------------------

The problem seems to hurt primarily UDP traffic, but
TCP traffic could also be affected because we use
stateful firewalls.

It would be great if anybody could point me to the right
places to look.
Also, are there useful tools (apart from tcpdump) that
would help to diagnose such a situation?

Thanks
Chris

-- 
----------------------------------------------------------------------
Christoph Kaegi                                           kgc@zhwin.ch
----------------------------------------------------------------------