Hello,

I need some help trying to even understand where to begin to look for a
solution for a network problem I've been having.

The network looks something like this:

                         ISP
                          ^
                          |
                          v
              ZyXEL modem in bridged mode
                          ^
                          |
                          v
                         wm0
    Soekris net6501 running NetBSD/amd64 netbsd-6 branch
          ^ wm1           ^ wm2           ^ wm3
          |               |               |
     -----                |                ------
     |                    |                     |
   switch               switch                WiFi
    |||                  |||                  Access
  Computers             Media                 Point
                        Center

Some relevant rc.conf variables:
------------------------------------
auto_ifconfig=YES
ifconfig_wm1="inet 192.168.72.1 netmask 255.255.255.0"
ifconfig_wm2="inet 192.168.92.1 netmask 255.255.255.0"
ifconfig_wm3="inet 192.168.124.1 netmask 255.255.255.0"
dhclient=YES
dhclient_flags="wm0"
dhcpd=YES
dhcpd_flags="-q wm1 wm2 wm3"
ipfilter=YES
ipnat=YES
------------------------------------

I went from using the router-part of the ZyXEL modem, to bridging it and
letting the Soekris-box handle the higher-level stuff. When I set it up,
it Just Worked (well, more or less). Because I have a dynamic address,
my ipnat.conf and ipf.conf are generated from ipnat.conf.template and
ipf.conf.template which simply have "$EXT" strings in them which I sed
to the appropriate values.

Anywho -- I set everything up, and it worked like a charm for several
months. At times, I unplugged the system (when there's a risk of
thunderstorms/lightning), but I only had everything unplugged for a few
hours; until a few months back, when I had everything unplugged for two
days, which meant that when I booted the Soekris machine, it got a new
public ip address, so I ran the scripts to reconfigure ipf.conf and
ipnat.conf (and ran reload using the rc.d scripts). I got some really
odd network (nonfunctional) behavior, so I rebooted the soekris machine,
and then it appeared to work.

However, over time I noticed a few quirks:

- I started having quite a few issues when establishing TCP
  connections. Seemingly randomly chosen connections would simply never
  really complete.
- Skype was really flaky (suddenly messages will simply be "Pending",
  and can stay that way for up to five minutes or so before being
  sent). People trying to send messages to me say they are seeing the
  same thing.
- Whenever I was logged in to PSN on my PS3, it would automatically log
  me out after a while (a few minutes to an hour or so).
- Somewhat often when I send a mail, Thunderbird will complain that it
  can't save the message to the Sent-folder (IMAP@Google); retrying a
  few times would make it succeed after a while.

I started thinking there was something wrong at my ISP until I noticed
another quirk:

- From my main computer (on the switch on wm1) when I ssh to my media PC
  (on the switch on wm2), if I don't do anything with the ssh session
  (like type in it, or have it update by showing top, rtorrent or
  something else which keeps updating its display), the ssh session
  will simply hang (become non-responsive) after a very short time (I'd
  say roughly one minute or so).

I became pretty frustrated with the behavior, and just to see if there
was a quick fix, I simply shut everything down and restarted again. That
fixed all the problems. No more odd connection problems, no more
"Pending" messages in Skype, no more being signed out of PSN, no more
Thunderbird problems, and I could establish an ssh session from my main
PC to my media-PC, open up a prompt, then leave it open for hours, even
days, without being touched, and then when I tried using it, it worked
(just as I would expect it to).

So I was naturally annoyed that I never got to the bottom of it, but I
was happy my network wasn't behaving weirdly any longer.

Fast-forward a few weeks, and we had another warning about potential
lightning/thunderstorms, so I unplugged everything, went away for a few
days, came back, plugged everything in, got a new IP-address from my
ISP, ran my scripts, restarted.
Then the odd problems were back (odd connection problems, Skype
problems, save Sent-mail problems, PSN signouts, ssh would hang).

After a few days a friend of mine complained about the "Pending"
problem, so I decided to try the "quick fix" again by simply shutting
everything down and starting everything up again. But this time it
didn't work. And I've tried it again, just to be sure.

I should stress that the vast majority of things I do work. I can browse
the network (albeit with a few connections acting up as described
above), I can read mail (again, with a few connection problems). PSN is
the only thing which feels very ... er.. reliably unreliable.

Although I have very few samples, I'm pretty sure that all these issues
are related (like I said, I had none of them for a long time, all of
them appeared at the same time, they all went away at the same time, and
they all came back at the same time), and because of the ssh-problem, I
get a feeling it's a very local problem.

Oh, btw, the ssh problem occurs also when I ssh from a system on the
WiFi network to the media-PC, but it does not occur when I ssh to my
Soekris router (that connection never dies).

I'm at a loss. I've tried the very few things I can think of, like
checking the routes, making sure that the generated ipf.conf and
ipnat.conf look ok (and it all looks ok). And also:

# sysctl -a | grep forward
net.inet.ip.forwarding = 1
net.inet6.ip6.forwarding = 0

Any hints, tips, help is very welcome -- I'd like to figure out once and
for all what is causing this. I pretty much suck at network
administration, so if you're wondering if I tried running <useful
network tool X>, then it's a safe bet that I haven't, because I never
heard of it. :)

-- 
Kind regards,
Jan Danielsson