tech-net archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: if_txtimer API to replace (*if_watchdog)()




> On Jan 23, 2020, at 9:38 PM, Taylor R Campbell <campbell+netbsd-tech-net%mumble.net@localhost> wrote:
> 
> I don't really understand why you're proposing a new API.  But let me
> make sure I understand the context first.
> 
> When we submit a packet to a hardware queue for tx, we want to wait up
> to n seconds before giving up -- meaning that we report the packet
> dropped (if_oerrors++), reset the device, free the packet, recycle
> queue entries, whatever.

That's the basic idea, yes.

> Right now, we have a mechanism if_watchdog (a.k.a. if_slowtimo) which
> works as follows, for drivers that use it, like wpi(4):  Every second,
> if ifp->if_timer is nonzero, we decrement it, and if it becomes zero,
> we call ifp->if_watchdog -- all under the kernel lock.

Note: The WiFi drivers are a little weird in that net80211 kind of abuses what if_watchdog was originally meant for.

> - Some drivers like sk(4) set ifp->if_timer = 5 and use if_watchdog to
>  give up after 5sec.
> 
> - Some drivers like wpi(4) set ifp->if_timer = 1, sc->sc_tx_timer = 5,
>  and use if_watchdog to count down sc->sc_tx_timer every second.
>  That way they can give up after 5sec but also do other housekeeping
>  like ieee80211_watchdog every second.

Again, the WiFi drivers are abusing the historical use of if_watchdog.

> - Some drivers like wm(4) use a callout instead of if_watchdog.

Originally, wm(4) used if_watchdog / if_timer like ~everyone else.  But many drivers also grew their own independent 1-second timer used for link maintenance (I'm fairly familiar with the original behavior and intent of both of these things, since I am the original author of wm(4) and also wrote our MII layer, which is the origins of the 1-second link maintenance timer that many drivers employ).  The wm(4) driver was one such driver that used the "tick" callout model for link maintenance (to call mii_tick()).

> - Some drivers like wm(4) maintain one timer for each tx queue.
> 
> The _main_ problem you're trying to solve, as I understand it, is that
> management of ifp->if_timer -- in net/if.c but also scattered
> throughout drivers all over the tree -- is open-coded and relies on
> the kernel lock, which you would like to avoid (but please correct me
> if I have misunderstood):

There are a 3 basic problems:

1- if_timer relies on the KERNEL_LOCK.

2- if_timer does not have a way to model multiple queues.

3- There should be one basic way to modeling the transmit watchdog timer, rather than N.  (In the examples you cited above, N is definitely > 1.)

>>> On Jan 20, 2020, at 2:48 PM, Jason Thorpe <thorpej%me.com@localhost> wrote:
>>> 
>>> 2- It's not easy to make MP-safe because it relies on modifying
>>> the ifnet structure periodically outside of the normal locking
>>> mechanisms.
> 
> Presumably we could add a new lock for access to if_timer, and that
> would solve this problem.  But it seems to me it would be even simpler
> to just use a callout to solve this:

So, yes, one could just use a callout to solve this universally; I initially considered it.  That will be a bit more work to adopt inside drivers, but that's not really a showstopper.  But this proposal isn't borne simply out of ease-of-conversion.

Callouts are a bit more expensive... to arm them or introspect them requires taking an additional lock **shared by other callouts** that could be a source of lock contention; my proposal has the txtimer protected by the txq lock that would already be held in a NET_MPSAFE driver, and arming is just a 64-bit relaxed atomic load (yes, I know this would require a slightly more expensive operation on 32-bit machines), an unprotected 64-bit store, and an unprotected store of a bool (these unprotected stores are actually protected by the txq lock; I call them "unprotected" because they themselves don't requite atomicity or ordering). This could make a difference when you're trying to blast out packets at 40Gb/s.

Callouts also have a larger memory footprint ... not hugely so, mind you, but each one is 10 pointers large; my proposal uses a bit less memory than callouts... nearly a wash if you consider the fields that can be removed from struct ifnet when the conversion of every driver is complete. (Though, I do concede that for drivers that do not already use a periodic timer, it would end up using about the same amount of memory as a plain callout, once you account for the fields that can be removed from struct ifnet).  This memory footprint isn't going to shatter the earth... but it does potentially mean more cache lines occupied.

Tx transmit timers have no need for precision ... 1 second granularity is totally sufficient, and many drivers are already using a 1-second tick timer for link maintenance / stats updating; my proposal piggy-backs on that.  The cost is that you do some work once a second.

> - Where a driver currently sets ifp->if_timer = 5, it can instead
>  callout_schedule(sc->sc_ch, 5).
> 
> - Where a driver currently sets ifp->if_timer = 1 and also uses
>  sc->sc_tx_timer, it can instead callout_schedule(sc->sc_ch, 1) and
>  continue to use sc->sc_tx_timer under its own lock.
> 
> - Where a driver currently sets ifp->if_timer = 0, it can
>  callout_stop(sc->sc_ch), provided, of course, that MP-safe drivers
>  also callout_halt(sc->sc_ch) before callout_destroy(sc->sc_ch) in
>  detach and before anything else relies on the callout not to be
>  running.
> 
> - Where a driver currently queries ifp->if_timer == 0 to decide
>  whether to reschedule it, it can do (under the driver lock):
> 
> 	if (!callout_pending(sc->sc_ch) && !callout_invoking(sc->sc_ch))
> 		callout_schedule(sc->sc_ch, ...);
> 
> Why do we need anything more than a callout?
> 
> 
> You stated a second goal first:
> 
>>> 1- It does not have any way to represent multiple hardware-managed
>>> transmit queues.
> 
> But I don't really see why adding a new txtimer API helps any more
> than a callout.  Does it actually reduce any duplicated logic that
> could meaningfully be shared between, e.g., wm(4) and another
> interface with multiple tx queues?

You are not wrong that the entire problem could be addressed by using callouts everywhere.

> I'm especially skeptical of an API that, by dynamic conditionals,
> may combine
> 
> (a) a simple countdown timer (if_txtimer),
> and

if_txtimer is not a countdown timer.  When the timer is armed, it take a timestamp, and takes another timestamp and compares them when the "is this expired??" question is asked.  It's specifically designed to let someone else do the asking of that question.

> (b) a replica of the callout API using dynamic allocation but with an
> interlock determined up front and other slight changes and that has an
> embedded simple countdown timer with a boolean indicating that it's
> actually the more complex thing (if_txtimer_callout),

The intent of if_txtimer_callout is to essentially provide a the callout-backed "tick" facility to drive if_txtimer to drivers that wouldn't otherwise need that facility for their own purposes.  It's a set of convenience functions to reduce code duplication.

> without common logic that is meaningfully agnostic to which case it's
> working with.

if_txtimer_arm(), if_txtimer_disarm(), if_txtimer_is_armed(), if_txtimer_is_expired(), and if_txtimer_is_expired_explicit() work exactly the same for both the if_txtimer and if_txtimer_callout cases.  The users of if_txtimer_callout don't generally need to call if_txtimer_is_expired(), however, because that's done by a convenience function.  Every driver could use if_txtimer_callout, but I chose to let drivers that already have their own tick callout piggy-back on that.

>  For example, will any driver call if_txtimer_start
> without actually knowing up front whether it needs to use the callout?
> Will any driver call if_txtimer_stop in a code path that would not be
> adequately served either by if_txtimer_disarm, or by an unconditional
> if_txtimer_disarm & callout_stop?

The rules are pretty simple:

-- If the driver already provides its own 1-second "tick" callout, it uses that, and doesn't bother with if_txtimer_start() or if_txtimer_{stop,halt}().

-- If the driver does not already provide its own 1-second "tick" callout, it uses if_txtimer_start() and if_txtimer_{stop,halt}().

if_txtimer_{stop,halt}() internally also does a if_txtimer_disarm(), but that's to avoid having to call two different function in the if_stop routine (if_txtimer_disarm() followed by if_txtimer_stop()).

-- thorpej



Home | Main Index | Thread Index | Old Index