Subject: Re: sk(4) interrupt moderation timing fix, and sysctl support
To: Jonathan Stone <>
From: Bill Studenmund <>
List: tech-net
Date: 11/28/2005 19:06:33

On Sun, Nov 27, 2005 at 12:29:01PM -0800, Jonathan Stone wrote:
> In message <>Jeff Rizzo writes
> >On Thu, Nov 24, 2005 at 09:49:39AM -0800, Jeff Rizzo wrote:
> >> Jason Thorpe wrote:
> >>
> >> >
> >> > On Nov 23, 2005, at 8:10 PM, Jeff Rizzo wrote:
> >> >
> >> >> 1. Should I bother supporting a per-board value for the interrupt
> >> >> moderation
> >> >> timer?  In this patch, it's global, and only takes effect when
> >> >> sk_init()
> >> >> is called. (ifconfig down/up is a good way to trigger this)

It would be nice to eventually change this so that whenever the sysctl is
changed, the hardware is updated. Actually, I think it's important. :-)

> >> > Seems like it should be per-instance rather than global, especially
> >> > if different revisions of the chip need to have different values.
> I have to disagree here. My eventual intent was to add support for
> dynamic auto-tuning of interrupt mitigation. For that, one really
> needs to apply *global* moderation, for stability reasons: either all
> devices stay at their current moderation level, or all go up a little,
> or all go down a little.
> The nonzero bge values were intended to be
> 	(1 + log_2 (Rx packets per interrupt))
> but that didn't quite work out with newer chips.
> For manual tweaking, I can sort-of see Jason's point, but I submit
> that, for systems with multiple NICs, having drivers map global
> "levels" with cross-device meaning levels into chip-specific constants
> is a generally more-useful hook than per-device hooks; though the
> latter can express policies (e.g., one interface with fixed low
> latency, others with high throughput) that the former cannot.
> Nevertheless, I still advocate a global FSM which computes one of
> three states, "too hot/too cold/just right", as a function of idle
> CPU; and if the FSM computes "too hot" or "too cold" K times in a row,
> the FSM increases or decreases interrupt mitigation globally, just a
> tad per device, to match.

Wouldn't it also be possible for the FSM to be told to just update all the
instances? Since we're talking policy, my first instinct is to have some
little daemon in userland running the FSM. In that case, we can give it a
list of interfaces to watch and it can easily update N interfaces instead
of one global one.
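
To make that concrete, here is a minimal sketch in C of the FSM walking a
list of interfaces. Everything in it is an assumption made for illustration:
the names (struct iface, fsm_step), the idle-CPU thresholds, K = 3, the
level range, and the choice that "too hot" means low idle CPU and therefore
more mitigation. A real daemon would write a per-device sysctl where the
sketch only updates a field.

```c
#include <assert.h>
#include <stddef.h>

/* All names and constants here are hypothetical, not from the sk(4) patch. */
enum load_state { TOO_COLD, JUST_RIGHT, TOO_HOT };

struct iface {
	const char *name;	/* e.g. "sk0" */
	int level;		/* current moderation level, 0..MAX_LEVEL */
};

struct fsm {
	enum load_state last;	/* state computed from the previous sample */
	int streak;		/* how many times in a row we have seen it */
};

#define MAX_LEVEL	8	/* illustrative upper bound */
#define K_IN_A_ROW	3	/* the "K times in a row" threshold */

/* Classify one idle-CPU sample (percent); thresholds are illustrative. */
static enum load_state
classify(int idle_pct)
{
	if (idle_pct < 10)
		return TOO_HOT;
	if (idle_pct > 50)
		return TOO_COLD;
	return JUST_RIGHT;
}

/*
 * Feed one sample to the FSM. After K identical "too hot" or "too cold"
 * results, nudge every listed interface one notch and return the step
 * requested (+1 or -1); otherwise return 0. Levels are clamped.
 */
static int
fsm_step(struct fsm *f, int idle_pct, struct iface *ifs, size_t n)
{
	enum load_state s = classify(idle_pct);
	int delta;
	size_t i;

	assert(f != NULL && (ifs != NULL || n == 0));

	if (s == f->last) {
		f->streak++;
	} else {
		f->last = s;
		f->streak = 1;
	}
	if (s == JUST_RIGHT || f->streak < K_IN_A_ROW)
		return 0;

	delta = (s == TOO_HOT) ? 1 : -1;
	f->streak = 0;
	for (i = 0; i < n; i++) {
		int lvl = ifs[i].level + delta;

		if (lvl < 0)
			lvl = 0;
		if (lvl > MAX_LEVEL)
			lvl = MAX_LEVEL;
		ifs[i].level = lvl;	/* real code: set the per-device knob */
	}
	return delta;
}
```

Starting from struct fsm f = { JUST_RIGHT, 0 }, three consecutive samples
below 10% idle raise every listed interface by one level; anything in
between resets the count.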

> (Once I put it that way, it should be immediately apparent that having
> each driver do its own computation in isolation will yield a system
> that is prone to oscillations, if not worse.)
> >Here's a new patch - if this looks OK to folks, I'll commit it.
> Can you keep support for global hooks which go up or down one notch,
> even if they're not exposed to userland?

Actually, I'm going to vote for what's in this patch (per-device). Ok, I'm
assuming we get to the point where we auto-update the hardware on change
(I think that's "the right thing" to do). The problem I see with a global
value is that all "the appropriate" devices have to watch for changes, so
that they propagate out. So they have to be told to watch (or do it
automatically), and the update code has to iterate over a list of
watchers to tell the watchers that there has been a change.

If we instead leave things per-device and have the FSM be given a list of
devices to update, each device need only react to its own value changing.
That's easy, as I understand the sysctl system. So we combine a simple
list walker in the FSM and simple change-reaction logic in the device
drivers, and the right thing just happens. :-)
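
As a rough sketch of the driver side, assuming a per-instance value in
microseconds: the setter below validates the new value and reprograms the
timer at once, so no ifconfig down/up is needed. It is plain C with a
simulated register, not the real sysctl(9) glue, and every name and constant
in it (struct fake_softc, int_mod_set, the range limits, TICKS_PER_USEC) is
hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver's per-instance state. */
struct fake_softc {
	uint32_t int_mod_usec;	/* value exposed through the per-device knob */
	uint32_t hw_timer_reg;	/* simulated moderation timer register */
	unsigned hw_writes;	/* how many times the "hardware" was written */
};

#define INT_MOD_MIN_USEC	10u	/* illustrative limits */
#define INT_MOD_MAX_USEC	10000u
#define TICKS_PER_USEC		50u	/* illustrative, not from a datasheet */

/* Push the current per-instance value into the (simulated) hardware. */
static void
hw_program_timer(struct fake_softc *sc)
{
	sc->hw_timer_reg = sc->int_mod_usec * TICKS_PER_USEC;
	sc->hw_writes++;
}

/*
 * Called when this instance's knob is written. Rejects out-of-range
 * values with EINVAL, skips the register write if nothing changed, and
 * otherwise updates the hardware immediately.
 */
static int
int_mod_set(struct fake_softc *sc, uint32_t new_usec)
{
	if (new_usec < INT_MOD_MIN_USEC || new_usec > INT_MOD_MAX_USEC)
		return EINVAL;
	if (new_usec == sc->int_mod_usec)
		return 0;
	sc->int_mod_usec = new_usec;
	hw_program_timer(sc);
	return 0;
}
```

Each instance reacts only to its own value, so the FSM (or a person with
sysctl) just sets N values and never needs a list of watchers.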

So, Jonathan, the cool thing you're describing will work and also the same
system will work well with folks who just want to manually tweak things.
Put another way, the FSM you're describing can just drop right on top of
this infrastructure when it's ready.

Take care,

