Subject: Re: ypbind hangs as of current from midday yesterday (kern+user)
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Sarton O'Brien <bsd-xen@roguewrt.org>
List: current-users
Date: 06/27/2007 09:54:22
On Wed, 27 Jun 2007 02:17:26 am Manuel Bouyer wrote:
> On Tue, Jun 26, 2007 at 01:08:37PM +1000, Sarton O'Brien wrote:
> > Since upgrading to yesterday's current my NIS server (xen domu) won't
> > start ypbind.
> >
> > I have tried reinitializing the yp database and files with no change. All
> > the other NIS relevant processes are starting fine. NFS is working better
> > than ever.
> >
> > What can I do to provide more details? I've not had to debug a process
> > before so any information that can enable me to help would be
> > appreciated.
>
> Can you ping your ypbind client when this happens ?
> I would first start with tcpdump on both lo0 and the network interface ...

It is the server I am trying to start ypbind on. When I start ypbind from an
ssh session, I can still ping the server and that ssh session stays
responsive, but I can't ssh in from another console. As soon as I ^C ypbind,
the ssh client in the other console logs in.

Loopback is seeing:

09:44:50.049092 IP (tos 0x0, ttl  64, id 21366, offset 0, flags [none], length: 164, bad cksum 0 (->28d1)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:50.049290 IP (tos 0x0, ttl  64, id 21367, offset 0, flags [none], length: 148, bad cksum 0 (->28e0)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:50.049429 IP (tos 0x0, ttl  64, id 21368, offset 0, flags [none], length: 56, bad cksum 0 (->293b)!) localhost.1020 > localhost.exp1: [bad udp cksum b3df!] UDP, length: 28
09:44:50.049453 IP (tos 0x0, ttl  64, id 21369, offset 0, flags [none], length: 164, bad cksum 0 (->28ce)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:50.049591 IP (tos 0x0, ttl  64, id 21370, offset 0, flags [none], length: 148, bad cksum 0 (->28dd)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:50.049716 IP (tos 0x0, ttl  64, id 21371, offset 0, flags [none], length: 56, bad cksum 0 (->2938)!) localhost.1020 > localhost.exp1: [bad udp cksum 73df!] UDP, length: 28
09:44:56.099756 IP (tos 0x0, ttl  64, id 21488, offset 0, flags [none], length: 164, bad cksum 0 (->2857)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:56.099934 IP (tos 0x0, ttl  64, id 21489, offset 0, flags [none], length: 148, bad cksum 0 (->2866)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:56.100066 IP (tos 0x0, ttl  64, id 21490, offset 0, flags [none], length: 56, bad cksum 0 (->28c1)!) localhost.1020 > localhost.exp1: [bad udp cksum 33df!] UDP, length: 28
09:44:56.100089 IP (tos 0x0, ttl  64, id 21491, offset 0, flags [none], length: 164, bad cksum 0 (->2854)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:56.100221 IP (tos 0x0, ttl  64, id 21492, offset 0, flags [none], length: 148, bad cksum 0 (->2863)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:56.100320 IP (tos 0x0, ttl  64, id 21493, offset 0, flags [none], length: 56, bad cksum 0 (->28be)!) localhost.1020 > localhost.exp1: [bad udp cksum f3de!] UDP, length: 28
09:45:02.159762 IP (tos 0x0, ttl  64, id 22181, offset 0, flags [none], length: 164, bad cksum 0 (->25a2)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:45:02.159939 IP (tos 0x0, ttl  64, id 22182, offset 0, flags [none], length: 148, bad cksum 0 (->25b1)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:45:02.160071 IP (tos 0x0, ttl  64, id 22183, offset 0, flags [none], length: 56, bad cksum 0 (->260c)!) localhost.1020 > localhost.exp1: [bad udp cksum b3de!] UDP, length: 28

The daily output in dom0 also reports this (the daily output from the domU is
fine):

network:
netstat: kvm_read: Bad address
Name            Ipkts  Ierrs        Opkts  Oerrs  Colls

I'm getting the impression it's network card related.

A bit more info:

uname -a&&pkg_info|grep xen
NetBSD gogeta.internal 4.99.21 NetBSD 4.99.21 (XEN3_DOM0) #4: Mon Jun 25 05:04:37 EST 2007  root@spike.internal:/usr/obj/sys/arch/i386/compile/XEN3_DOM0 i386
xenkernel3-3.1.0    Xen3 Kernel
xentools3-3.1.0     Userland Tools for Xen

ifconfig -a
bge0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        capabilities=3f80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
        enabled=0
        address: 00:13:72:18:02:ad
        media: Ethernet autoselect (100baseTX full-duplex,flowcontrol,rxpause,txpause)
        status: active
        inet 192.168.210.10 netmask 0xffffff00 broadcast 192.168.210.255
        inet6 fe80::213:72ff:fe18:2ad%bge0 prefixlen 64 scopeid 0x1

I could start poking but this probably looks obvious to someone else. Should I 
disable hardware csums?
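
As a sanity check on the loopback capture: as far as I can tell, the value
tcpdump prints after "->" is the IPv4 header checksum it expected, and the 0
before it is what was actually in the header. Below is a minimal Python sketch
that rebuilds the header of the first packet and recomputes the checksum. It
is not something from my setup. It assumes localhost is 127.0.0.1, a 20-byte
header with no IP options, and protocol 17 (UDP); the helper names are made up
for illustration.

```python
import socket
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement sum over 16-bit words (RFC 1071 style).

    The checksum field inside `header` must be zero when calling this.
    """
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(total_length: int, ident: int, ttl: int = 64, proto: int = 17,
                 src: str = "127.0.0.1", dst: str = "127.0.0.1") -> bytes:
    """Hypothetical helper: 20-byte IPv4 header, tos 0, no flags, no options.

    The addresses are an assumption; the capture only shows "localhost".
    """
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,            # version 4, header length 5 words
        0,               # tos 0x0
        total_length,    # "length:" from the tcpdump line
        ident,           # "id" from the tcpdump line
        0,               # flags [none], offset 0
        ttl,
        proto,
        0,               # checksum field left as zero
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


# First packet in the capture: id 21366, length 164
print(hex(ipv4_header_checksum(build_header(164, 21366))))  # 0x28d1
```

That matches the "(->28d1)" in the first line of the capture, so the header
checksum field on lo0 appears to have been left at zero rather than corrupted.
On its own that may be harmless, and it would not necessarily point at bge0.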

Thanks,

Sarton