Subject: kern/14090: "panic: lockmgr: locking against myself" with nullfs
To: None <gnats-bugs@gnats.netbsd.org>
From: None <apb@cequrux.com>
List: netbsd-bugs
Date: 09/28/2001 17:16:55
>Number:         14090
>Category:       kern
>Synopsis:       "panic: lockmgr: locking against myself" with nullfs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Sep 28 08:21:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Alan Barrett
>Release:        NetBSD-current 'Sun Sep 23 21:12:12 EDT 2001'
>Organization:
Not much
>Environment:
NetBSD/i386 1.5Y
Built from sources checked out from CVS with -D'Sun Sep 23 21:12:12 EDT 2001'

>Description:
This is a new problem in NetBSD-1.5Y.  It was not present in a kernel
built on 5 Sep 2001 from sources that were probably a few days older
than that.

In an environment that makes heavy use of nullfs and raid0 filesystems,
one of the "find" commands run from the daily cron jobs causes a panic.
I don't yet know for sure which find command was responsible, but I
suspect the find|xargs|sort pipeline in the check_devices section of
/etc/security.

Here's the panic message and a hand-transcribed backtrace (with function
arguments omitted):

   panic: lockmgr: locking against myself
   stopped in pid 28479 (find) at cpu_Debugger+0x4: leave
   db> t
   cpu_Debugger(...) +0x4
   panic(...) +0xad
   lockmgr(...) +0x591
   layer_lock(...) +0x44
   VOP_LOCK(...) +0x2e
   vn_lock(...) +0x5d
   getnewvnode(...) +0x122
   ffs_vget(...) +0x4f
   ufs_lookup(...) +0x9bd
   VOP_LOOKUP(...) +0x35
   lookup(...) +0x236
   namei(...) +0x2f1
   sys___lstat13(...) +0x4f
   syscall_plain(...) +0xa7
   db>

The only printable string that I could easily recover was the second
argument passed to lookup(): "ircsearch".  I believe that two paths
match that name: /r1a/USR-PKG/bin/ircsearch is a
directory on an ordinary FFS filesystem on a raid0 partition, and
/usr/pkg/bin/ircsearch is a nullfs image of the same underlying file.
The /etc/fstab entries for the /r1a and /usr/pkg filesystems are as
follows:

    /dev/raid1a     /r1a            ffs     rw,softdep      1 3
    /r1a/USR-PKG    /usr/pkg        null    rw              0 0

>How-To-Repeat:
>Fix:
1) Fix the locking bug.

2) It might be a good idea to add "-o -fstype null" to the set of
   exclusions in the find command.  There's no point in walking
   both the nullfs tree and the underlying tree.
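
   As a rough illustration of item 2, the exclusion could take a shape
   like the sketch below.  Only the "-fstype null" test comes from the
   suggestion above; the function name list_outside_nullfs, the starting
   directory and the "-type f -print" action are placeholders, and this
   is not the actual expression used in /etc/security.

```shell
# Hypothetical helper: walk a tree without descending into nullfs mounts.
# Only "-fstype null" is taken from the suggestion in this report; the
# function name and the "-type f -print" action are illustrative.
list_outside_nullfs() {
    # A directory that lives on a null mount is pruned, so the underlying
    # tree is the only copy that gets walked.  Everything else falls
    # through to the right-hand side of -o and is printed if it is a
    # regular file.
    find "$1" -fstype null -prune -o -type f -print
}
```

   Running list_outside_nullfs / would visit /r1a/USR-PKG once and skip
   the /usr/pkg image of it, assuming find reports the mount's type as
   "null".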
>Release-Note:
>Audit-Trail:
>Unformatted: