NetBSD-Bugs archive
Re: kern/48733: deadlock in if_output() with interrupt on KERNEL_LOCK
The following reply was made to PR kern/48733; it has been noted by GNATS.
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: kern-bug-people%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost,
netbsd-bugs%NetBSD.org@localhost,
Wolfgang.Stukenbrock%nagler-company.com@localhost
Subject: Re: kern/48733: deadlock in if_output() with interrupt on KERNEL_LOCK
Date: Mon, 12 May 2014 14:23:46 +0200
On Fri, May 09, 2014 at 12:20:00PM +0000, Wolfgang Stukenbrock wrote:
> [...]
> db{0}> bt/a fffffe822f73f420
> trace: pid 0 lid 3 at 0xfffffe810e967760
> ether_output() at netbsd:ether_output+0x2b6
> ip_output() at netbsd:ip_output+0xa8f
> tcp_output() at netbsd:tcp_output+0x1698
> tcp_input() at netbsd:tcp_input+0x15d9
> ip_input() at netbsd:ip_input+0x3ef
> ipintr() at netbsd:ipintr+0x109
> softint_dispatch() at netbsd:softint_dispatch+0xd9
> DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe810e967d70
> Xsoftintr() at netbsd:Xsoftintr+0x4f
> --- interrupt ---
> 0:
>
> That is the part that is going to send a packet. I see the printout in
> ip_output prior to calling 'ifp->if_output()' - not the one behind.
> The location pointed to by the backtrace in ether_output() is the call
> to "return ifq_enqueue(...)". I also see the printout I've added in
> front of this call, but not the one behind.
> In ifq_enqueue() I see the output of the call to 'ifp->if_start' - the
> wm-driver - in this routine and the printout in front of the splx(s) at
> the end of the routine - not the printout behind it.
> This is the location where the deadlock happens while processing other
> interrupts in Xspllower.
> This always looks the same ....
ether_output() is called with the KERNEL_LOCK held, so at this point cpu0
already owns KERNEL_LOCK and won't spin trying to grab it again.
You can confirm this by printing curcpu()->ci_biglock_count.
Did you try a kernel with options LOCKDEBUG?
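
To illustrate why a CPU that already owns the big lock does not spin on it,
here is a minimal user-space sketch of a recursive per-CPU lock count. It is
only a model, not the kernel code: struct cpu_model, struct biglock_model,
biglock_try_enter() and biglock_exit() are hypothetical names, and
biglock_count merely stands in for ci_biglock_count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; this is not the NetBSD implementation. */
struct cpu_model {
	int id;
	int biglock_count;	/* stands in for ci_biglock_count */
};

struct biglock_model {
	bool held;
	int owner;		/* id of the owning CPU, -1 when free */
};

/*
 * Returns true if the lock was taken or re-entered, false if the caller
 * would have to spin because another CPU owns the lock.
 */
static bool
biglock_try_enter(struct biglock_model *lk, struct cpu_model *ci)
{
	if (ci->biglock_count > 0) {
		/* This CPU already owns the lock: just count the recursion. */
		ci->biglock_count++;
		return true;
	}
	if (lk->held)
		return false;	/* a real kernel would spin here */
	lk->held = true;
	lk->owner = ci->id;
	ci->biglock_count = 1;
	return true;
}

/* Drops one level of recursion; the lock is freed when the count hits 0. */
static void
biglock_exit(struct biglock_model *lk, struct cpu_model *ci)
{
	assert(ci->biglock_count > 0 && lk->owner == ci->id);
	if (--ci->biglock_count == 0) {
		lk->held = false;
		lk->owner = -1;
	}
}
```

In this model an interrupt handler entered on the owning CPU only bumps the
count, so a non-zero count on cpu0 would be consistent with the claim above
that it is not spinning on the lock.
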
What's possible here is a loop trying to process the same interrupt
forever.
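
A rough sketch of how such a loop could arise with a level-triggered
interrupt, assuming a handler that returns without clearing the condition in
the device. Everything here (struct dev_model, good_handler(), bad_handler(),
dispatch_pending()) is a hypothetical user-space model, not code from the
kernel or from any driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device driving a level-triggered interrupt line. */
struct dev_model {
	bool irq_asserted;	/* stays true until the device is serviced */
};

typedef void (*handler_fn)(struct dev_model *);

/* Services the device, which deasserts the line. */
static void
good_handler(struct dev_model *d)
{
	d->irq_asserted = false;
}

/* Returns without touching the device, so the line stays asserted. */
static void
bad_handler(struct dev_model *d)
{
	(void)d;
}

/*
 * Models lowering the interrupt priority level: as long as the line is
 * still asserted the interrupt is dispatched again. The limit only exists
 * so that the model terminates; real hardware has no such cap.
 * Returns the number of times the handler ran.
 */
static int
dispatch_pending(struct dev_model *d, handler_fn h, int limit)
{
	int n = 0;

	assert(limit > 0);
	while (d->irq_asserted && n < limit) {
		h(d);
		n++;
	}
	return n;
}
```

With good_handler() the loop ends after one dispatch; with bad_handler() it
runs until the artificial limit, which is the "same interrupt forever"
pattern in miniature.
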
>
>
>
> db{0}> bt/a fffffe822f736440
> trace: pid 0 lid 6 at 0xfffffe810e9739c8
> breakpoint() at netbsd:breakpoint+0x5
> comintr() at netbsd:comintr+0x518
> Xintr_ioapic_edge1() at netbsd:Xintr_ioapic_edge1+0xea
> --- interrupt ---
> bus_space_read_4() at netbsd:bus_space_read_4+0xa
> intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x3b
> Xintr_ioapic_level6() at netbsd:Xintr_ioapic_level6+0xf2
> --- interrupt ---
> Xspllower() at netbsd:Xspllower+0xe
> DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe810e973d70
> Xsoftintr() at netbsd:Xsoftintr+0x4f
> --- interrupt ---
> 0:
>
>
> Hmmm - not sure about it ...
> It looks like, while one pending interrupt was being processed in
> Xspllower, at the end of that routine an interrupt came in that takes the
> KERNEL_LOCK in intr_biglock_wrapper() again and does what? Hang in
> bus_space_read_4() ???? Busy-loop for whatever reason in that interrupt,
> and the location where the DDB-enter occurs in bus_space_read_4() is
> just random ????
> The comintr looks like the break-interrupt on the serial console of the
> system to enter DDB to me.
It is.
> Any idea to find out what interrupt routine it is???
dmesg could point to the problem; the interrupt we're looking for is
level-triggered on pin 6 (so maybe "irq 6").
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference