Subject: Re: Is this a scheduler issue?
To: Jeff Rizzo <riz@redcrowgroup.com>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: tech-kern
Date: 07/14/2006 18:09:59
On Thu, 13 Jul 2006 21:41:14 -0700, Jeff Rizzo <riz@redcrowgroup.com>
wrote:

> Hi-
> 
> I'm currently working on a project in which I'm using ICMP to measure
> RTT across a switch fabric over long periods of time; currently sending
> about 50 packets per minute between pairs of hosts.  (In the production
> deployment, there will be one probe per switch, to look for inter-switch
> issues.)  I'd really like to use NetBSD for this, as I've got a lovely
> build/distribution setup for this that will greatly ease the deployment
> of new or replacement systems.   However, the userland portion of the
> ping setup periodically shows high RTT spikes - 5-6 times a day, the max
> RTT jumps from ~0.5 ms to as high as 25 ms.  (That is, 5 or 6 packets per
> day exhibit this.)
> 
> Two (identical hardware - soekris net4801) Linux boxes pinging each
> other don't exhibit this behavior; neither does a Linux box pinging a
> NetBSD host.   nice -20 helps a little, but not much - for whatever
> reason, a few times a day (irregular intervals, and I've employed static
> ARP to remove that variable) I get these spikes, and I would very much
> like not to.  :)
> 
> Does anyone have suggestions for something I can do to improve the
> situation?  I understand that NetBSD doesn't have any realtime
> capability - but I was hoping to get "close enough", as these boxes
> aren't doing anything else.  I've already got my build framework set up,
> and switching to another OS (especially if that's Linux) will be a
> _major_ hassle...  I'm willing to render the system sub-optimal for
> "other" work.
> 
> This is 3.0_STABLE, though I have verified that a -current (as of
> yesterday) kernel exhibits the same behaviour...
> 
Just for fun, try disabling cron and see what happens.  And do you have
any other daemons running?
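
One way to check the cron theory from data you already have: cron wakes
up at the start of each minute, so if it (or something it launches) is
responsible, the spikes should cluster near second 0 of the minute.  The
following is only a rough sketch, assuming your ping wrapper can log each
probe as an (epoch seconds, RTT in ms) pair; the function names, the 5 ms
threshold, and the sample values are made up for illustration.

```python
from collections import Counter


def find_spikes(samples, threshold_ms=5.0):
    """Return the (epoch_seconds, rtt_ms) samples whose RTT exceeds threshold_ms."""
    return [(t, rtt) for t, rtt in samples if rtt > threshold_ms]


def seconds_past_minute(spikes):
    """Count spikes by the whole second within the minute at which they occurred."""
    return Counter(int(t) % 60 for t, _ in spikes)


# Hypothetical log entries, not real measurements.
samples = [
    (1152900000.2, 0.48),
    (1152900001.4, 0.52),
    (1152900060.3, 24.9),   # spike just after a minute boundary
    (1152903600.9, 18.3),   # spike just after a minute boundary
    (1152903637.0, 0.51),
]

spikes = find_spikes(samples)
histogram = seconds_past_minute(spikes)
for second, count in sorted(histogram.items()):
    print(f"second {second:2d} of the minute: {count} spike(s)")
```

If the real spikes pile up at second 0 or 1, a periodic job is a likely
suspect; if they are spread evenly across the minute, look elsewhere.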


		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb