tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: fix gif(4)'s softint(9) contract violation [was Re: RFC: gif(4) MP-ify]



Hi, riastradh@n.o

Thank you for comments.

On 2016/01/23 1:29, Taylor R Campbell wrote:
>    Date: Fri, 22 Jan 2016 13:37:23 +0900
>    From: Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>
> 
>    I implement this processing. Here is the patch.
>        http://netbsd.org/~knakahara/fix-gif-softint/fix-gif-softint.patch
> 
>    The patch includes reverting cvs kern_softint.c:r1.42. Furthermore, 
>    the patch fixes the race between encap[46]_lookup(rnh->rnh_matchaddr)
>    and encap_detach(rnh->rnh_deladdr) keeping packet processing path
>    lockless.
>    # I think radix tree must be required lock.
> 
>    However, this patch doesn't make gif(4) mp-ify yet. I will send gif(4)
>    mp-ify patch later.
> 
>    Could you comment this patch?
> 
> Your patch has two largely independent functional changes in it:
> 
> 1. fix the softint disestablish issue by adding a pause operation, and
> 2. make ip_encap MP-scalable by using a pserialized list instead of a
> giant-locked radix tree.
> 
> Part (1) can be done a little more easily, I think -- I've attached a
> compile-tested patch that does only part (1), by calling encap_detach
> before softint_disestablish.  It doesn't seem to require any special
> cooperation from gif_output or gifintr.  I probably overlooked
> something, though.

Hum...sorry, I think I don't understand your intent. Could you tell me
how encap_detach() stops softint_schedule() in gif_output()?
I think gif(4)'s transmission path is not related to ip_encap.c.

> Part (2) makes ip_encap scale well to many cores, but it stops
> ip_encap from scaling well to many tunnels, because it removes the
> radix tree.  Maybe that's OK -- maybe today, scaling to many cores is
> more important than scaling to many tunnels.  But there's a comment in
> the old code suggesting that scaling to many tunnels may be important:
> 
> /*
>  * The code will use radix table for tunnel lookup, for
>  * tunnels registered with encap_attach() with a addr/mask pair.
>  * Faster on machines with thousands of tunnel registerations (= interfaces).
>  ...
> 
> Do you have a plan for how to address scaling to many tunnels too?

# As you point out in the other mail, the radix tree does not help
# much to scale to many tunnels. However, it reduces processing per
# loop, so it helps to scale by some measures. I think it is required
# to fix the scaling to many tunnels issue anyway.

Yes. I plan to use hash table to search a match tunnel if possible.
As far as gif(4), I *think* the prio check is not required. I think
it is satisfied by psrc and pdst exact match like OpenBSD's gif(4).
If prio check is not required in fact, it can use hash table.


> One heavy-handed but easy way that we might keep the radix tree *and*
> scale to many cores would be to simply drop encapsulated packets while
> anyone is reconfiguring a tunnel and modifying the radix tree.
> 
> I think you could do something like the following.  I've also added a
> reference count to each tunnel in this sketch.  I did this because
> without the reference count, it's not clear to me how to ensure that
> the tunnel will continue to exist after the pserialize_read_exit.

I will review the codes closer and reply the result to the other mail.

BTW, I also want to reduce radix tree caller to simplify routing table
refactoring (and mp-ify future work).


Thanks,

-- 
//////////////////////////////////////////////////////////////////////
Internet Initiative Japan Inc.

Device Engineering Section,
Core Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>


Home | Main Index | Thread Index | Old Index