NetBSD-Bugs archive

Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes



On Fri, Sep 4, 2015 at 6:23 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
> On Sep 4,  6:10pm, ozaki-r%netbsd.org@localhost (Ryota Ozaki) wrote:
> -- Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
>
> | On Fri, Sep 4, 2015 at 5:40 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
> | > On Sep 4,  1:50pm, ozaki-r%netbsd.org@localhost (Ryota Ozaki) wrote:
> | > -- Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
> | >
> | > | On Thu, Sep 3, 2015 at 8:38 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
> | > | > I just crashed in arptimer() so there are more locking problems in the code. Can you document the locking discipline for la_rt and changing the lists?
> | > | >
> | > |
> | > | I'm sorry for the defect.
> | > |
> | > | An ARP cache list of an interface is protected by a rwlock
> | > | (IF_AFDATA_*LOCK) and each ARP cache is protected by a rwlock
> | > | and reference counting (LLE_*LOCK). However, la_rt still needs
> | > | softnet_lock; if la_rt is accessed or modified without
> | > | softnet_lock, it's a bug. And I found a bug :( lltable_free
> | > | accesses la_rt but it's called without softnet_lock.
> | > |
> | > | Here is a patch:
> | > | http://www.netbsd.org/~ozaki-r/lltable_free-softnet_lock.diff
> | > | Could you try it?
> | > |
> | >
> | > Thanks, I am running with it now. Should we revert the KASSERT change
> | > too?
> |
> | Well, yes and no. The KASSERT was actually wrong: la_rt can be NULL
> | at that point, according to my investigation for PR 50184. So we
> | have to get rid of it anyway.
> |
> | I made a patch for the bug:
> | http://www.netbsd.org/~ozaki-r/fix-PR50184.take2.diff
> | which was for PR 50184. So reverting your commit and applying the patch
> | instead might be easier for me. Of course, rebasing my patch on HEAD
> | would make no difference, though.
>
> Why don't you commit both of them?

I want to confirm that they really fix the bug(s). I have failed to reproduce
the panic on my machines. Does my softnet_lock patch fix your issue?

  ozaki-r
