Current-Users archive


Re: ssh client_loop send disconnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)



Hello,

On 25.06.23 03:48, RVP wrote:
On Sat, 24 Jun 2023, Brian Buhrow wrote:

In any case, the fact that you're getting regular delays on your pings suggests there is a delay between the time when the ARP cache times out and when it gets refreshed.


This would be determined by `net.inet.arp.nd_delay' I think (on
-HEAD).

As a consequence of that delay, if you have a high speed stream running when the cache times out, it's possible the send buffer of the sending process, i.e. sshd, is filling up before that
cache gets refreshed and the packets can flow again.


In this case, the kernel would either block the sshd process or
return EAGAIN--which is handled. The kernel should only return an
EHOSTDOWN if `net.inet.arp.nd_bmaxtries' * `net.inet.arp.nd_retrans'
(i.e. 3 * 1000 ms) has passed without getting an ARP response. Even
on a LAN, this is pretty unlikely (even with that peculiarly short
30-second ARP-address cache timeout). Smells like a Xen+load+timing
issue (not hand-wavy at all there, RVP!). It would be interesting
to see the tcpdump capture from the DomU.

-RVP
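
As a rough sanity check of the arithmetic quoted above, here is a minimal sketch that multiplies the retry count by the retransmit interval. The function name `arp_giveup_ms` is hypothetical, and the values 3 and 1000 ms are the ones quoted above, not read from a live system.

```shell
# Hypothetical helper: milliseconds before ARP resolution is given up,
# assuming the limit is simply (max tries) * (retransmit interval in ms).
arp_giveup_ms() {
    tries=$1
    retrans_ms=$2
    echo $((tries * retrans_ms))
}

# Values quoted in the thread: 3 tries, 1000 ms between retransmits.
arp_giveup_ms 3 1000
```

On a live system the two arguments could presumably be taken from `sysctl -n net.inet.arp.nd_bmaxtries` and `sysctl -n net.inet.arp.nd_retrans`.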

Over the last day I did some further tests and tried out all the hints I got in this thread. Here is a short summary:

1) Ran a ping overnight from DomU to Dom0 -> no dropouts

2) Increased the ARP cache timeout (net.inet.arp.nd_reachable=1200000)
   on both Dom0 and DomU -> this seemed to have an effect at first,
   but the problem still exists (it's not a measured fact but a feeling
   that it now happens a bit less often and later)

3) Checked send/receive buffer configuration

```
srv-net$ sysctl net.inet.tcp.sendbuf_auto
net.inet.tcp.sendbuf_auto = 1
srv-net$ sysctl net.inet.tcp.recvbuf_auto
net.inet.tcp.recvbuf_auto = 1
srv-net$ sysctl net.inet.tcp.sendbuf_max
net.inet.tcp.sendbuf_max = 262144
srv-net$ sysctl net.inet.tcp.recvbuf_max
net.inet.tcp.recvbuf_max = 262144
```

These samples are from DomU, but Dom0 has an identical configuration.

4) Running the test with tcpdump from DomU -> this is currently ongoing. I will follow up as soon as I have the results.
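
For reference on item 2: assuming `net.inet.arp.nd_reachable` is expressed in milliseconds (which would match the 30-second timeout mentioned earlier in the thread, but is an assumption here), the new value can be converted to minutes with a small sketch like this. The helper name `ms_to_minutes` is hypothetical.

```shell
# Hypothetical helper: convert a millisecond value to whole minutes.
# Assumes the sysctl value is in milliseconds.
ms_to_minutes() {
    echo $(( $1 / 60000 ))
}

# Value set in item 2 above.
ms_to_minutes 1200000
```

Under that assumption the setting corresponds to a 20-minute cache lifetime.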


Kind regards
Matthias



