Re: 5.99.42/i386 crash (backtrace + core available)

To: Dennis den Brok <denbrok%uni-bonn.de@localhost>
Subject: Re: 5.99.42/i386 crash (backtrace + core available)
From: haad <haaaad%gmail.com@localhost>
Date: Fri, 7 Jan 2011 01:59:26 +0100

On Fri, Jan 7, 2011 at 1:50 AM, haad <haaaad%gmail.com@localhost> wrote:
> On Fri, Jan 7, 2011 at 12:35 AM, Dennis den Brok 
> <denbrok%uni-bonn.de@localhost> wrote:
>> (This discussion was temporarily held privately.)
>>
>> David Holland <dholland%netbsd.org@localhost> wrote:
>>> > It very much seems to be the same crash; the backtrace is the same.
>>> > After reproducing the crash on the console, I can now also say it's
>>> > a "locking against myself" lock error indeed. The kernel is quite
>>> > recent, so I don't think it's yamt's reverted changes.
>>>
>>> Probably advisable to check that explicitly. (Easy to do though; the
>>> broken version is 1.126 of sys/kern/vfs_lookup.c.)
>>>
>>> > If it helps, I obtained a coredump produced with a DIAGNOSTIC+LOCKDEBUG
>>> > kernel. I would need further instructions what to provide from
>>> > that, though.
>>>
>>> The first thing to check is whether the vnode it's tripping over is
>>> the same one it's trying to do lookup on. If it is, there's probably a
>>> refcount bug somewhere in the call chain, and knowing it's there we/I
>>> can probably find it. If it isn't... if it's some totally random
>>> vnode, it's probably yet another race condition in vnode reclaim and
>>> I'm not sure where to look. (I'm not really up on the vnode lifecycle
>>> stuff yet.) However, if one of the vnodes is the nullfs projection of
>>> the other, then it's probably fallout from Juergen's layer-locking
>>> cleanup. Unfortunately I'm not sure how to check for that easily: the
>>> layer_node inside the nullfs private vnode data contains a
>>> layer_lowervp that points to the corresponding vnode from the
>>> underlying fs, but extracting that with ddb may be somewhat painful.
>>>
>>> Another good thing to check is where the offending lock was originally
>>> acquired; the code address appears in the LOCKDEBUG panic message so
>>> it should just require translating that to a source location.
>>>
>>> Also, it's probably a good idea to take this back to the mailing list;
>>> more eyes there.
>>>
>>> > I can now reliably reproduce the crash by grepping over the pkgsrc
>>> > tree while pbulk is in the "scan" phase.
>>
>> I tried to get useful information from ddb, but ddb itself gets
>> uvm faults as soon as I ask it for something interesting like a
>> backtrace or information on locks, vnodes, etc., so unfortunately,
>> I don't quite know where to go from here...
>
>
> Maybe this can help you ?

Sorry, this is the working link:

http://wiki.netbsd.org/users/haad/ddb_howto/



-- 


Regards.

Adam
