NetBSD-Bugs archive


Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes



The following reply was made to PR kern/50186; it has been noted by GNATS.

From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
To: Christos Zoulas <christos%zoulas.com@localhost>
Cc: "gnats-bugs%NetBSD.org@localhost" <gnats-bugs%netbsd.org@localhost>, 
	"kern-bug-people%netbsd.org@localhost" <kern-bug-people%netbsd.org@localhost>, 
	"gnats-admin%netbsd.org@localhost" <gnats-admin%netbsd.org@localhost>, "netbsd-bugs%netbsd.org@localhost" <netbsd-bugs%netbsd.org@localhost>, 
	"jdbaker%mylinuxisp.com@localhost" <jdbaker%mylinuxisp.com@localhost>
Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
Date: Fri, 4 Sep 2015 18:30:40 +0900

 On Fri, Sep 4, 2015 at 6:23 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
 > On Sep 4,  6:10pm, ozaki-r%netbsd.org@localhost (Ryota Ozaki) wrote:
 > -- Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
 >
 > | On Fri, Sep 4, 2015 at 5:40 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
 > | > On Sep 4,  1:50pm, ozaki-r%netbsd.org@localhost (Ryota Ozaki) wrote:
 > | > -- Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
 > | >
 > | > | On Thu, Sep 3, 2015 at 8:38 PM, Christos Zoulas <christos%zoulas.com@localhost> wrote:
 > | > | > I just crashed in arptimer() so there are more locking problems in the code. Can you document the locking discipline for la_rt and changing the lists?
 > | > | >
 > | > |
 > | > | I'm sorry for the defect.
 > | > |
 > | > | An ARP cache list of an interface is protected by a rwlock
 > | > | (IF_AFDATA_*LOCK) and each ARP cache is protected by a rwlock
 > | > | and reference counting (LLE_*LOCK). However, la_rt still needs
 > | > | softnet_lock; if la_rt is accessed or modified without
 > | > | softnet_lock, it's a bug. And I found a bug :( lltable_free
 > | > | accesses la_rt but it's called without softnet_lock.
 > | > |
 > | > | Here is a patch:
 > | > | http://www.netbsd.org/~ozaki-r/lltable_free-softnet_lock.diff
 > | > | Could you try it?
 > | > |
 > | >
 > | > Thanks, I am running with it now. Should we revert the KASSERT change
 > | > too?
 > |
 > | Well, yes and no. The KASSERT was actually wrong; la_rt can be NULL
 > | at that point according to my investigation for PR 50184. So we
 > | have to get rid of it anyway.
 > |
 > | I made a patch for the bug:
 > | http://www.netbsd.org/~ozaki-r/fix-PR50184.take2.diff
 > | which was for PR 50184. So reverting your commit and applying the patch
 > | instead might be easy for me. Of course rebasing my patch on the HEAD
 > | makes no difference though.
 >
 > Why don't you commit both of them?
 
 I want to confirm that they really fix the bug(s) first. I have failed to
 reproduce the panic on my machines. Does my softnet_lock patch fix your issue?
 
   ozaki-r
 

