Current-Users archive


Re: 5.99.42/i386 crash (backtrace + core available)



(This discussion was temporarily held privately.)

David Holland <dholland%netbsd.org@localhost> wrote:
>  > It very much seems to be the same crash; the backtrace is the same.
>  > After reproducing the crash on the console, I can now also say it is
>  > indeed a "locking against myself" lock error. The kernel is quite
>  > recent, so I don't think it's yamt's reverted changes.
>
> Probably advisable to check that explicitly. (Easy to do though; the
> broken version is 1.126 of sys/kern/vfs_lookup.c.)
>
>  > If it helps, I obtained a coredump produced with a DIAGNOSTIC+LOCKDEBUG
>  > kernel. I would need further instructions on what to provide from
>  > that, though.
>
> The first thing to check is whether the vnode it's tripping over is
> the same one it's trying to do lookup on. If it is, there's probably a
> refcount bug somewhere in the call chain, and knowing it's there we/I
> can probably find it. If it isn't... if it's some totally random
> vnode, it's probably yet another race condition in vnode reclaim and
> I'm not sure where to look. (I'm not really up on the vnode lifecycle
> stuff yet.) However, if one of the vnodes is the nullfs projection of
> the other, then it's probably fallout from Juergen's layer-locking
> cleanup. Unfortunately I'm not sure how to check for that easily: the
> layer_node inside the nullfs private vnode data contains a
> layer_lowervp that points to the corresponding vnode from the
> underlying fs, but extracting that with ddb may be somewhat painful.
>
> Another good thing to check is where the offending lock was originally
> acquired; the code address appears in the LOCKDEBUG panic message so
> it should just require translating that to a source location.
>
> Also, it's probably a good idea to take this back to the mailing list;
> more eyes there.
>
>  > I can now reliably reproduce the crash by grepping over the pkgsrc
>  > tree while pbulk is in the "scan" phase.
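
For reference, the triage described above boils down to a three-way
decision. The following is only a toy sketch of that reasoning, not
kernel code: the Vnode class and its lowervp field are hypothetical
stand-ins (lowervp models the layer_lowervp pointer of a nullfs node),
and the vnode names in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Vnode:
    """Toy stand-in for a kernel vnode; NOT the real struct vnode."""
    name: str
    # Models layer_lowervp: set only for a (hypothetical) nullfs vnode.
    lowervp: Optional["Vnode"] = None


def triage(locked: Vnode, looked_up: Vnode) -> str:
    """Map the relation between the two vnodes to the likely suspect."""
    # Same vnode: the lookup target is the one already locked.
    if locked is looked_up:
        return "same vnode: suspect a refcount bug in the call chain"
    # One vnode is the nullfs projection of the other.
    if locked.lowervp is looked_up or looked_up.lowervp is locked:
        return "nullfs projection: suspect the layer-locking cleanup"
    # Anything else looks like a random vnode.
    return "unrelated vnode: suspect a race in vnode reclaim"


if __name__ == "__main__":
    lower = Vnode("underlying-fs vnode")
    upper = Vnode("nullfs vnode", lowervp=lower)
    print(triage(upper, lower))
```

Comparing by identity (is) rather than by value mirrors comparing
vnode pointers in the coredump.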

I tried to get useful information from ddb, but ddb itself hits uvm
faults as soon as I ask it for anything interesting, such as a
backtrace or information on locks or vnodes. Unfortunately, I don't
quite know where to go from here.
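
Translating the code address from the LOCKDEBUG panic message should
at least not need ddb. Below is a minimal sketch of how that could be
scripted around binutils' addr2line. It assumes a kernel image built
with debug symbols is at hand; the file name netbsd.gdb and the
address in the usage line are placeholders, not values from this
crash.

```python
import subprocess
from typing import List


def addr2line_cmd(kernel_image: str, address: int) -> List[str]:
    """Build the addr2line command line for one code address."""
    # -f also prints the enclosing function name; -e selects the image.
    return ["addr2line", "-f", "-e", kernel_image, hex(address)]


def resolve(kernel_image: str, address: int) -> str:
    """Run addr2line and return its output (function, then file:line)."""
    result = subprocess.run(
        addr2line_cmd(kernel_image, address),
        capture_output=True,
        text=True,
        check=True,  # raise if addr2line exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Placeholder image name and address; substitute the real ones.
    print(resolve("netbsd.gdb", 0xC0123456))
```

The result is only meaningful if the image is the exact kernel that
produced the panic, since addresses shift between builds.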

Thanks,

Dennis den Brok

