Current-Users archive


Re: ssh client_loop send disconnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)



	Hello.  My understanding is that the arp caching mechanism works the same way regardless of
whether you use static MAC addresses or dynamically generated ones.  The reason is that arp
bridges the gap between the layer 2 network, i.e. the MAC addresses, and the layer 3 network,
i.e. the IP addresses those MAC addresses map to.  You can demonstrate this interaction by
shutting down the vif interface to your domu, deleting the domu's entry from the arp cache with
arp -d <IP address>, and then trying to ping your domu from dom0.  After about 20 seconds, you
should see the "host is down" message.  Then use arp -a to look for your domu's IP address.
What you'll see in the MAC field is the word "incomplete".  If you then run brconfig on the
bridge containing the domu, you'll see the MAC address you assigned, or which was assigned
dynamically, alive and well.
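
	The sequence above can be sketched as a small shell script.  The interface name vif1.0, the
bridge name bridge0 and the address 192.168.1.10 are placeholders I made up, not values from this
thread, so substitute your own.  By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the arp demonstration. All three values below are hypothetical
# placeholders; override them in the environment or edit them here.
VIF=${VIF:-vif1.0}
BRIDGE=${BRIDGE:-bridge0}
DOMU_IP=${DOMU_IP:-192.168.1.10}

# Print each command; execute it only when RUN=1 is set in the environment.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

arp_demo() {
    step ifconfig "$VIF" down       # take the domu's vif down on the dom0
    step arp -d "$DOMU_IP"          # drop the cached arp entry for the domu
    step ping -c 30 "$DOMU_IP"      # expected to fail while the vif is down
    step arp -an                    # look for "incomplete" next to the domu's IP
    step brconfig "$BRIDGE"         # the bridge should still list the domu's MAC
    step ifconfig "$VIF" up         # restore the interface afterwards
}
```

Save it to a file, source it, and call arp_demo to review the command list.  Calling it as root
with RUN=1 set actually executes the steps, and will interrupt traffic to the domu while it runs.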

	My guess is that you're running into some sort of short term memory crunch inside the
dom0's network stack.  The long term ping test should provide more details about where this
memory crunch might be.  The long time favorite variable for this issue is the good ole
nmbclusters value, tunable in the kernel config and visible through:
/sbin/sysctl kern.mbuf.nmbclusters
Although it's a blunt instrument, the output from:
netstat -m
might be helpful as well, specifically the value listed as the number of calls to protocol
drain routines.
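
	One way to watch that counter during the long term ping test is sketched below.  It assumes
the count is the first field on the "calls to protocol drain routines" line of netstat -m, and
INTERVAL is a name I made up for the sampling period, not a system setting.

```shell
#!/bin/sh
# Read `netstat -m` output on stdin and print the count from the
# "calls to protocol drain routines" line (assumed to be the first field).
drain_calls() {
    awk '/calls to protocol drain routines/ { print $1 }'
}

# Poll netstat -m and log a line only when the drain counter changes.
# INTERVAL (seconds, default 60) is a hypothetical knob for this sketch.
watch_drains() {
    prev=""
    while :; do
        cur=$(netstat -m | drain_calls)
        if [ "$cur" != "$prev" ]; then
            echo "$(date) drain calls: $cur" \
                 "nmbclusters: $(sysctl -n kern.mbuf.nmbclusters)"
            prev=$cur
        fi
        sleep "${INTERVAL:-60}"
    done
}
```

Run watch_drains on the dom0 in a spare terminal while the ping test is going.  If the drain
count climbs around the time the ssh sessions drop, that points toward mbuf cluster exhaustion.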

	Yet another possibility is a firewall, either on the dom0 or on the domu in question.  If
you're running into some rule that restricts access or bandwidth on the path between the dom0
and the domu, you might see this kind of behavior.  Unfortunately, in my experience, when one
runs into a firewall issue of this nature, the error messaging around it is very misleading.
It's important to remember that the IP stacks on the dom0 and the domu don't know that the
machine at the other end of the connection is actually running on the same hardware.
Consequently, if there are firewall rules set up on the dom0, the domu, or both, be sure they
provide full access between the dom0 and the domu, just as you would if you were writing rules
for remote machines.
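
	To review what is actually loaded, a sketch like the following can be run on both the dom0
and the domu.  The thread doesn't say which packet filter, if any, is in use, so it simply tries
the control tools for npf, pf and ipf in turn; the function name is my own.

```shell
#!/bin/sh
# Print the currently loaded rules for whichever packet filter control
# tools are found in PATH. Needs root to read the rule sets.
show_fw_rules() {
    found=0
    if command -v npfctl >/dev/null 2>&1; then
        found=1; echo "== npf =="; npfctl show
    fi
    if command -v pfctl >/dev/null 2>&1; then
        found=1; echo "== pf =="; pfctl -s rules
    fi
    if command -v ipfstat >/dev/null 2>&1; then
        found=1; echo "== ipf =="; ipfstat -io
    fi
    [ "$found" = 1 ] || echo "no packet filter control tool found in PATH"
}
```

A tool being installed doesn't mean its filter is enabled, so an error or an empty rule list
from one of these is normal.  What you're looking for is any rule that matches traffic between
the dom0 and domu addresses.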

The fact that you're only seeing this problem when communicating between the dom0 and the domu,
and not between the domu and the rest of the world, suggests to me the problem is on the dom0,
so I would start by looking there first.

Hope these notes help.
-Brian
