Subject: Re: Are tlp[01] weak for big traffic?
To: None <port-cobalt@netbsd.org>
From: NAKAJI Hiroyuki <nakaji@jp.freebsd.org>
List: port-cobalt
Date: 12/31/2006 00:17:24
>>>>> In <061221231457.M0101727@mirage.ceres.dti.ne.jp> 
>>>>>	Izumi Tsutsui <tsutsui@ceres.dti.ne.jp> wrote:

> > > Hmm, how about "tcpdump -env"?
> > 
> > I recorded with -w option. What is to be checked especially?

> Is there any packet at that time?

I examined the recorded files with tcpdump -r.

On tlp1, the last packets are mainly DNS and NBT.

07:01:27.596723 IP 192.168.1.252.netbios-ns > 192.168.1.255.netbios-ns: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
07:01:29.095517 IP 192.168.1.252.2133 > 203.139.160.74.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:31.095913 IP 192.168.1.252.2133 > www.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:31.096008 IP 192.168.1.252.2133 > 203.139.161.40.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:31.096035 IP 192.168.1.252.2133 > 203.139.160.74.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:35.096698 IP 192.168.1.252.2133 > www.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:35.096783 IP 192.168.1.252.2133 > 203.139.161.40.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:35.096809 IP 192.168.1.252.2133 > 203.139.160.74.domain:  46939+ A? www.heimat.gr.jp. (34)
07:01:43.097624 IP 192.168.1.252.2135 > 203.139.160.74.domain:  49240 PTR? 1.0.0.127.in-addr.arpa. (40)
07:01:44.097492 IP 192.168.1.252.2135 > 203.139.160.74.domain:  49240 PTR? 1.0.0.127.in-addr.arpa. (40)
07:01:45.097666 IP 192.168.1.252.2135 > 203.139.160.74.domain:  49240 PTR? 1.0.0.127.in-addr.arpa. (40)

On tlp0, the last packets are ICMP and NBT.

07:04:26.401531 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 153
07:04:27.402665 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 154
07:04:27.827320 IP 60.32.13.194.137 > 60.32.13.199.137: NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
07:04:27.827556 IP 60.32.13.194.137 > 60.32.13.199.137: NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
07:04:28.403831 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 155
07:04:29.404940 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 156
07:04:29.830446 IP 60.32.13.194.137 > 60.32.13.199.137: NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
07:04:29.833753 IP 60.32.13.194.137 > 60.32.13.199.137: NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
07:04:30.406207 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 157
07:04:31.407392 IP 60.32.13.195 > 60.32.13.193: icmp 64: echo request seq 158
07:04:31.835261 IP 60.32.13.194.137 > 60.32.13.199.137: NBT UDP PACKET(137): REGISTRATION; REQUEST; BROADCAST
07:04:31.835595 IP 60.32.13.194.138 > 60.32.13.199.138: NBT UDP PACKET(138)
07:04:31.836318 IP 60.32.13.194.138 > 60.32.13.199.138: NBT UDP PACKET(138)
07:04:31.836464 IP 60.32.13.194.138 > 60.32.13.199.138: NBT UDP PACKET(138)
07:04:31.836620 IP 60.32.13.194.138 > 60.32.13.199.138: NBT UDP PACKET(138)
07:04:31.837055 IP 60.32.13.195.138 > 60.32.13.199.138: NBT UDP PACKET(138)
07:04:31.837307 IP 60.32.13.195.138 > 60.32.13.199.138: NBT UDP PACKET(138)
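
To make "what is mainly seen in the last packets" less of an eyeball job, the text printed by tcpdump -r can be tallied. This is only a minimal sketch: the regular expression and the classify() labels are my own hypothetical heuristics, fitted to the line shapes shown above and to nothing else.

```python
import re
from collections import Counter

# Matches the "HH:MM:SS.ffffff IP src > dst: rest" shape of the lines above.
LINE_RE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) IP (\S+) > (\S+):\s+(.*)$")


def classify(dst, rest):
    """Rough label for one packet line (hypothetical heuristic)."""
    if rest.startswith("icmp"):
        return "icmp"
    if rest.startswith("NBT"):
        return "nbt"
    if dst.endswith(".domain"):
        return "dns"
    return "other"


def summarize(lines):
    """Return (Counter of labels, timestamp of the last parsed line or None)."""
    counts = Counter()
    last = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a packet line
        ts, _src, dst, rest = m.groups()
        counts[classify(dst, rest)] += 1
        last = ts
    return counts, last
```

Feeding it the saved output of tcpdump -r for each interface gives a per-protocol count plus the time of the last packet seen, which is the moment the interface went quiet.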

> > When all tlp connections are lost, "vmstat -i" shows

> Please check if any interrupts occur during the trouble,
> not only rates/numbers. ("systat vmstat" might be better)

In the first 15 seconds, it shows no interrupts on cpu int 1 or cpu int 2.

    1 user     Load  0.14  0.11  0.08                  Sat Dec 30 23:10:06

Proc:r  d  s  w     Csw    Trp    Sys   Int   Sof    Flt      PAGING   SWAPPING
           8          2      1     12   105     5      2      in  out   in  out
                                                        ops
   0.0% Sy   0.0% Us   0.0% Ni   0.0% In 100.0% Id    pages
|    |    |    |    |    |    |    |    |    |    |
                                                                          forks
                                                                          fkppw
           memory totals (in kB)             105 Interrupts               fksvm
          real  virtual     free               2 soft serial              pwait
Active  114900   114900    67400               1 soft net                 relck
All     185688   185688   592060                 soft clock               rlkok
                                               2 cpu int 3                noram
Namei         Sys-cache     Proc-cache           cpu int 1                ndcpy
    Calls     hits    %     hits     %           icu irq 14               fltcp
        6        6  100                          cpu int 2                zfod
                                             100 int 5 (clock)            cow
Disks:   wd0                                                           64 fmin
 seeks                                                                 85 ftarg
 xfers                                                                    itarg
 bytes                                                               1163 wired
 %busy                                                                    pdfre
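
Another way to check whether interrupts occur at all during the trouble, rather than looking at rates, is to take two "vmstat -i" samples and compare the totals. A minimal sketch follows. It assumes each data line ends with two integer columns (total, then rate) and that everything before them is the interrupt name; I have not checked that against every vmstat version. The function names are hypothetical.

```python
import subprocess
import time


def parse_counts(text):
    """Map interrupt name -> total, assuming lines end in '<total> <rate>'."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header and anything without two trailing integers.
        if len(parts) < 3 or not (parts[-1].isdigit() and parts[-2].isdigit()):
            continue
        counts[" ".join(parts[:-2])] = int(parts[-2])
    return counts


def stalled_sources(before, after):
    """Names whose total did not advance between the two samples."""
    b, a = parse_counts(before), parse_counts(after)
    return sorted(name for name in b if a.get(name, b[name]) == b[name])


def sample_twice(interval=15):
    """Run 'vmstat -i' twice, 'interval' seconds apart, and compare."""
    first = subprocess.run(["vmstat", "-i"], capture_output=True, text=True).stdout
    time.sleep(interval)
    second = subprocess.run(["vmstat", "-i"], capture_output=True, text=True).stdout
    return stalled_sources(first, second)
```

Calling sample_twice() while the connections are lost would list the interrupt sources whose totals stayed flat over the interval.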


> > bash-3.2# ping 127.0.0.1
> > PING localhost (127.0.0.1): 56 data bytes
> > ping: sendto: No buffer space available
> > ping: sendto: No buffer space available
> > ping: sendto: No buffer space available
> > ping: sendto: No buffer space available

> Hmm, looks IFF_OACTIVE is set in tlp_start.
> Didn't you see any timeout error messages?
> How about ping right after "ifconfig tlp[01] up"?

Every IPv4 ping fails with "No buffer space available", but IPv6 still works.

$ ping -n 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C
----192.168.1.1 PING Statistics----
3 packets transmitted, 0 packets received, 100.0% packet loss

$ ping -n 60.32.13.193
PING 60.32.13.193 (60.32.13.193): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C
----60.32.13.193 PING Statistics----
3 packets transmitted, 0 packets received, 100.0% packet loss

$ ping6 -n 2001:3e0:a84:1::1
PING6(56=40+8+8 bytes) 2001:3e0:a84:1::1 --> 2001:3e0:a84:1::1
16 bytes from 2001:3e0:a84:1::1, icmp_seq=0 hlim=64 time=1.432 ms
16 bytes from 2001:3e0:a84:1::1, icmp_seq=1 hlim=64 time=1.004 ms
16 bytes from 2001:3e0:a84:1::1, icmp_seq=2 hlim=64 time=1.024 ms
^C
--- 2001:3e0:a84:1::1 ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 1.004/1.153/1.432/0.242 ms

$ ping6 -n 2001:3e0:a84::2
PING6(56=40+8+8 bytes) 2001:3e0:a84::2 --> 2001:3e0:a84::2
16 bytes from 2001:3e0:a84::2, icmp_seq=0 hlim=64 time=1.460 ms
16 bytes from 2001:3e0:a84::2, icmp_seq=1 hlim=64 time=0.999 ms
^C
--- 2001:3e0:a84::2 ping6 statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.999/1.230/1.460/0.326 ms

$ ifconfig -a
tlp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:10:e0:00:6f:33
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 60.32.13.193 netmask 0xfffffff8 broadcast 60.32.13.199
        inet6 fe80::210:e0ff:fe00:6f33%tlp0 prefixlen 64 scopeid 0x1
        inet6 2001:3e0:a84::2 prefixlen 64
tlp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:10:e0:00:6f:32
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::210:e0ff:fe00:6f32%tlp1 prefixlen 64 scopeid 0x2
        inet6 2001:3e0:a84:1::1 prefixlen 64
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33192
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
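
Since IFF_OACTIVE came up, the flags word printed by ifconfig can be decoded by hand to see whether that bit is visible. A minimal sketch, assuming the bit values below match NetBSD's net/if.h (I am writing them from memory, so please check them against the header); decode_flags() is a hypothetical helper.

```python
# Interface flag bits, assumed to match NetBSD's <net/if.h>.
IFF_BITS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0004: "DEBUG",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0020: "NOTRAILERS",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x0400: "OACTIVE",
    0x0800: "SIMPLEX",
    0x1000: "LINK0",
    0x2000: "LINK1",
    0x4000: "LINK2",
    0x8000: "MULTICAST",
}


def decode_flags(hex_word):
    """Return the flag names set in a hex flags word such as '8843'."""
    value = int(hex_word, 16)
    return [name for bit, name in sorted(IFF_BITS.items()) if value & bit]


def has_oactive(hex_word):
    """True if the (assumed) OACTIVE bit 0x400 is set in the flags word."""
    return "OACTIVE" in decode_flags(hex_word)
```

With these values, 8843 decodes to the same five names that ifconfig prints for tlp0 and tlp1, and the 0x400 bit is not among them, at least at the moment ifconfig ran.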

> Are you sure that you can see any input packets on tcpdump
> but ping fails (or netstat shows no in/out packets)?

I connected a Solaris PC to the tlp1 segment.

With "tcpdump -i tlp1", I can see only the DNS lookups from the Solaris PC on tlp1.
With "tcpdump -i tlp0", I can see ssh over IPv6 and ntp/snmp over IPv4.

> Anyway, it's also better to post dmesg (or kernel config
> if your kernel is not GENERIC).

Oops, I should have included it. My kernel config is:

include "arch/cobalt/conf/GENERIC"
file-system	OVERLAY
options	GATEWAY
pseudo-device	ppp
pseudo-device	tun
pseudo-device	gif

I'll try a GENERIC kernel. Thanks.
-- 
NAKAJI Hiroyuki