Current-Users archive

Re: NetBSD xen dom0 kernel panic



In article <49C01DF8.8020403%searchy.net@localhost>,
Frank de Bot  <netbsd%searchy.net@localhost> wrote:
>Christos Zoulas wrote:
>> On Mar 17,  1:54pm, netbsd%searchy.net@localhost (Frank) wrote:
>> -- Subject: Re: NetBSD xen dom0 kernel panic
>>
>> | Christos Zoulas wrote:
>> | > In article <49BF92B2.7020005%searchy.net@localhost>, Frank <netbsd%searchy.net@localhost> wrote:
>> | >   
>> | >> A few times my test NetBSD server running NetBSD 5.0_RC2 (amd64) 
>> | >> suddenly crashed. (every 2-3 days)
>> | >>
>> | >> At the moment of the crash dmesg is:
>> | >>
>> | >>     
>> | >
>> | > A LOCKDEBUG kernel will give you more information.
>> | >
>> | > christos
>> | >
>> | >   
>> | Ok, I'll build a kernel with that option. Are there any other things 
>> | which could be helpful in debugging this?
>>
>> I usually build testing kernels with DIAGNOSTIC+DEBUG+LOCKDEBUG. I think
>> that the problem you are seeing is a recursive lock that would result in
>> deadlock (or unlocking an unlocked lock). The LOCKDEBUG kernel should
>> point out where the lock is currently held from (or where it was last
>> held and released). There are ddb commands to examine the locks, but
>> I don't remember them off hand right now.
>>
>> christos
>>   
>
>I booted up a kernel with the LOCKDEBUG option on. It panicked during 
>boot. Output:
>
>Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context
>
>lock address : 0xffffffff80b20130 type     :     sleep/adaptive
>initialized  : 0xffffffff8043c60b
>shared holds :                  0 exclusive:                  0
>shares wanted:                  0 exclusive:                  0
>current cpu  :                  0 last held:                  0
>current lwp  : 0xffffa0002a3cbba0 last held: 000000000000000000
>last locked  : 0xffffffff8043b4ef unlocked : 0xffffffff8043b78c
>owner field  : 000000000000000000 wait/spin:                0/0
>
>Turnstile chain at 0xffffffff80b3c2c0.
>=> No active turnstile for this lock.

Good. I think that the pools that are used in interrupt context
should be using pool_cache_<foo> not pool_<foo> like the signal code does.
Is that right? Can you try to make this change?

christos
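
For reference, the DIAGNOSTIC+DEBUG+LOCKDEBUG testing kernel mentioned in
the quoted text could be configured roughly as below. This is only a
sketch: it assumes a custom config file (the name XEN3_DOM0_DEBUG is made
up here) that pulls in the stock amd64 XEN3_DOM0 config, and the stock
config may already enable some of these options.

```
# sys/arch/amd64/conf/XEN3_DOM0_DEBUG  (hypothetical file name)
# Start from the stock dom0 config, then add the debugging options.
include "arch/amd64/conf/XEN3_DOM0"

options 	DIAGNOSTIC	# cheap internal consistency checks
options 	DEBUG		# more expensive debugging checks
options 	LOCKDEBUG	# lock tracking; reports misuse like the panic above
```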


