NetBSD-Bugs archive
Re: kern/44418 (FAST_IPSEC and if_wm kernel panic - may affect the whole network stack)
The following reply was made to PR kern/44418; it has been noted by GNATS.
From: Wolfgang Stukenbrock <Wolfgang.Stukenbrock%nagler-company.com@localhost>
To: M.Drochner%fz-juelich.de@localhost
Cc: Wolfgang Stukenbrock <Wolfgang.Stukenbrock%nagler-company.com@localhost>,
gnats-bugs%NetBSD.org@localhost, kern-bug-people%NetBSD.org@localhost,
netbsd-bugs%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost
Subject: Re: kern/44418 (FAST_IPSEC and if_wm kernel panic - may affect the
whole network stack)
Date: Mon, 14 Feb 2011 17:38:09 +0100
Hi,
sorry, I cannot post it because I did not keep it.
I cannot say whether it was a real interrupt or a soft interrupt.
The stack in DDB looked something like this:
- the opencrypto return thread starts processing
- I had added code there that takes softnet_lock, but without raising
  the SPL level
- an interrupt came in next, and it panicked in mutex_enter(softnet_lock).
I think it was ipintr(), but I'm not sure anymore - sorry.
It had something to do with incoming packet processing.
And the panic was triggered by an ASSERT as far as I remember - not a
"normal" crash due to bad pointers or the like.
After adding splsoftnet() (and getting the KERNEL_LOCK) to the path in
the opencrypto return thread, the problem disappeared. That's why I
remember that I must be careful with mutex_enter() calls: unlike the
KERNEL_LOCK, softnet_lock may not be locked multiple times on different
paths on the same CPU.
I haven't dug deeper into that. I simply assumed that
mutex_enter()/mutex_exit() behaves like a POSIX mutex, which would also
fail if locked twice in the "correct" setup.
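
For reference, the POSIX side of that comparison can be checked in
userspace. The sketch below is only an analogy and says nothing about how
the kernel's mutex_enter() behaves: it uses pthreads, and the helper name
relock_errorcheck_mutex() is made up for illustration. It assumes an
error-checking mutex (PTHREAD_MUTEX_ERRORCHECK), for which POSIX specifies
that a second lock attempt by the owning thread fails with EDEADLK; with a
default mutex, relocking is undefined behaviour instead.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Userspace analogy only (pthreads, not the NetBSD kernel mutex API).
 * relock_errorcheck_mutex() is a hypothetical helper name.
 *
 * Locks an error-checking mutex twice from the same thread and returns
 * the result of the second pthread_mutex_lock() call, or -1 if the
 * setup failed.
 */
int
relock_errorcheck_mutex(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int second, rc;

	if (pthread_mutexattr_init(&attr) != 0)
		return -1;
	if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
	    pthread_mutex_init(&m, &attr) != 0) {
		pthread_mutexattr_destroy(&attr);
		return -1;
	}
	pthread_mutexattr_destroy(&attr);

	/* First lock: expected to succeed. */
	if (pthread_mutex_lock(&m) != 0) {
		pthread_mutex_destroy(&m);
		return -1;
	}

	/* Second lock by the owner: reported as an error, no deadlock. */
	second = pthread_mutex_lock(&m);

	/* The mutex is still held exactly once, so one unlock releases it. */
	rc = pthread_mutex_unlock(&m);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_destroy(&m);
	return second;
}
```

Calling relock_errorcheck_mutex() from a small main() and comparing the
result against EDEADLK shows the "fails if locked twice" behaviour.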
Best regards
W. Stukenbrock
Matthias Drochner wrote:
> Wolfgang.Stukenbrock%nagler-company.com@localhost said:
>
>>During testing I've got a panic when a CPU tries to get the mutex
>>twice.
>>
>
> OK, so I've reversed the locking order.
>
>
>>And if the spl-level does not lock out network interrupts, this may
>>happen. Even if the hole is very small ...
>>
>
> This is still somewhat strange... The mutex is an adaptive one,
> which cannot be taken in interrupt handlers at all. Only in
> the softint handler. As I understand it, it is also OK for an
> adaptive mutex to be attempted to be taken a second time,
> even if held by the same CPU (just not by the same thread).
> I can only suspect that the panic might be related to the
> limited thread context of the softint handler.
> If this is the case it would be an unnecessary limitation.
>
> Can you post the exact panic message and traceback,
> just to help to understand the issue?
>
> best regards
> Matthias
--
Dr. Nagler & Company GmbH
Hauptstraße 9
92253 Schnaittenbach
Tel. +49 9622/71 97-42
Fax +49 9622/71 97-50
Wolfgang.Stukenbrock%nagler-company.com@localhost
http://www.nagler-company.com
Hauptsitz: Schnaittenbach
Handelregister: Amberg HRB
Gerichtsstand: Amberg
Steuernummer: 201/118/51825
USt.-ID-Nummer: DE 273143997
Geschäftsführer: Dr. Martin Nagler, Dr. Dr. Karl-Kuno Kunze