tech-net archive
Re: hangs with awge(4) on pine64 rock64 board
On Sun, Jul 21, 2019 at 03:17:18PM +1000, matthew green wrote:
> hi folks..
>
>
> i've been debugging a hang on the rock64. it's fairly easy to
> trigger -- send a lot of data at it.
>
> from ddb i would usually see one cpu with an lwp, usually the
> idle lwp, fast lwp switched to softnet, and again fast switched
> to the softser lwp. it seemed to be a kernel lock issue as the
> kernel lock was held and at least one thread was waiting for
> it. i couldn't really tell what was up.
>
> i tried enabling NET_MPSAFE (which changes the behaviour of
> awge(4) / dwc_gmac.c, beyond the network stack.) that kernel
> ran for a lot longer, but ended up locking up again, this time
> the rt_lock was being waited upon. but again, i couldn't find
> where it was held or what context should be giving it up, though
> i did again think about arm's pic_dispatch() being the last
> lock and unlock of kernel_lock. then i realised that even with
> NET_MPSAFE, awge(4)'s frontends don't set up MPSAFE interrupts.
> with a kernel patched to do that under NET_MPSAFE i've had over
> 5 hours of heavy network access without a hang.
>
> i don't know what is the underlying issue here. it could be
> some network stack bug, it could be an awge/gmac bug, it could
> be an arm or arm64 bug..
>
> anyone have a clue where to investigate next? alternatively,
> how far off is NET_MPSAFE default? :)
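The change described in the quoted report (having the front end establish its interrupt as MP-safe when NET_MPSAFE is defined) can be pictured with the standalone sketch below. Nothing here is taken from dwc_gmac.c or its front ends: hyp_intr_establish(), HYP_INTR_MPSAFE and every other hyp_ name are made-up stand-ins that only mimic the shape of a kernel interrupt-establish call, so treat it as an illustration of the idea rather than the actual patch.

```c
/*
 * Sketch only: all hyp_* identifiers are hypothetical and exist just
 * to show the flag selection; this is not kernel code.
 */
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: handler may run without taking kernel_lock. */
#define HYP_INTR_MPSAFE	0x01

typedef int (*hyp_intr_fn)(void *);

struct hyp_intr {
	hyp_intr_fn	 func;	/* handler to call */
	void		*arg;	/* argument passed to the handler */
	int		 flags;	/* 0 or HYP_INTR_MPSAFE */
};

/* Hypothetical stand-in for a front end's interrupt-establish call. */
static struct hyp_intr *
hyp_intr_establish(struct hyp_intr *ih, int flags, hyp_intr_fn func,
    void *arg)
{
	assert(ih != NULL && func != NULL);
	ih->func = func;
	ih->arg = arg;
	ih->flags = flags;
	return ih;
}

/* Request an MP-safe interrupt only when the stack is built MP-safe. */
static int
hyp_awge_intr_flags(void)
{
#ifdef NET_MPSAFE
	return HYP_INTR_MPSAFE;
#else
	return 0;
#endif
}

/* Hypothetical handler; returns 1 to claim the interrupt. */
static int
hyp_awge_intr(void *arg)
{
	(void)arg;
	return 1;
}
```

A front end's attach routine would then pass hyp_awge_intr_flags() as the flags argument, so the driver and its interrupt agree on whether kernel_lock is expected; building with -DNET_MPSAFE flips the flag.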
It looks like something I fixed some time ago in the arm pmap:
http://mail-index.netbsd.org/source-changes/2019/04/23/msg105355.html
maybe arm64 has a similar issue.
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference