NetBSD-Bugs archive
Re: kern/56170: NFS-related: panic: lock error: Mutex: mutex_vector_enter,543: locking against myself
The following reply was made to PR kern/56170; it has been noted by GNATS.
From: Christos Zoulas <christos%zoulas.com@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/56170: NFS-related: panic: lock error: Mutex:
mutex_vector_enter,543: locking against myself
Date: Fri, 14 May 2021 17:24:29 -0400
https://www.netbsd.org/~christos/nfs.diff for a disgusting hack I am using to avoid this.
christos
> On May 14, 2021, at 4:45 PM, Greg A. Woods <woods%planix.ca@localhost> wrote:
>
>> Number: 56170
>> Category: kern
>> Synopsis: NFS+gcc-ASAN-related: panic: lock error: Mutex: mutex_vector_enter,543: locking against myself
>> Confidential: no
>> Severity: serious
>> Priority: medium
>> Responsible: kern-bug-people
>> State: open
>> Class: sw-bug
>> Submitter-Id: net
>> Arrival-Date: Fri May 14 20:45:00 +0000 2021
>> Originator: Greg A. Woods
>> Release: NetBSD 9.99.81
>> Organization:
> Planix, Inc.; Kelowna, BC; Canada
>> Environment:
> System: NetBSD xentastic 9.99.81 NetBSD 9.99.81 (XEN3_DOM0) #16: Thu May 6 13:40:07 PDT 2021 woods@xentastic:/build/woods/xentastic/current-amd64-amd64-obj/build/src/sys/arch/amd64/compile/XEN3_DOM0 amd64
> Architecture: x86_64
> Machine: amd64
>> Description:
>
> I've been trying out the GCC sanitizers on one of my recently
> favourite little projects, and I've found I can reliably crash
> NetBSD with one of the tests, when it is compiled with
> USE_ASAN=yes, at least when it is run with $PWD on an NFS mount.
>
> Here is the console output from an example crash:
>
>
> [ 663.0426878] Mutex error: mutex_vector_enter,543: locking against myself
>
> [ 663.0426878] lock address : 0xffffc8800b962b00
> [ 663.0426878] current cpu : 1
> [ 663.0426878] current lwp : 0xffffc8800b9db1c0
> [ 663.0426878] owner field : 0xffffc8800b9db1c0 wait/spin: 0/0
>
> [ 663.0426878] panic: lock error: Mutex: mutex_vector_enter,543: locking against myself: lock 0xffffc8800b00b9db1c0
> [ 663.0426878] cpu1: Begin traceback...
> [ 663.0426878] vpanic() at netbsd:vpanic+0x14a
> [ 663.0426878] snprintf() at netbsd:snprintf
> [ 663.0426878] lockdebug_abort() at netbsd:lockdebug_abort+0xcd
> [ 663.0426878] mutex_vector_enter() at netbsd:mutex_vector_enter+0x406
> [ 663.0426878] sigpending1() at netbsd:sigpending1+0x24
> [ 663.0527222] nfs_sigintr() at netbsd:nfs_sigintr+0x2c
> [ 663.0527222] nfs_rcvlock() at netbsd:nfs_rcvlock+0xaf
> [ 663.0527222] nfs_request() at netbsd:nfs_request+0x40d
> [ 663.0527222] nfs_access() at netbsd:nfs_access+0x1d4
> [ 663.0527222] VOP_ACCESS() at netbsd:VOP_ACCESS+0x55
> [ 663.0527222] getcwd_common() at netbsd:getcwd_common+0x251
> [ 663.0527222] vnode_to_path() at netbsd:vnode_to_path+0xbb
> [ 663.0527222] sysctl_vmproc() at netbsd:sysctl_vmproc+0x6cd
> [ 663.0527222] sysctl_dispatch() at netbsd:sysctl_dispatch+0xa5
> [ 663.0527222] sys___sysctl() at netbsd:sys___sysctl+0xc5
> [ 663.0527222] syscall() at netbsd:syscall+0x9c
> [ 663.0527222] --- syscall (number 202) ---
> [ 663.0527222] netbsd:syscall+0x9c:
> [ 663.0527222] cpu1: End traceback...
> [ 663.0527222] fatal breakpoint trap in supervisor mode
> [ 663.0527222] trap type 1 code 0 rip 0xffffffff8023e93d cs 0xe030 rflags 0x202 cr2 0x7f7ff6892ce0 ilevel
>
> [ 663.0527222] curlwp 0xffffc8800b9db1c0 pid 6987.6987 lowest kstack 0xffffc880ef49a2c0
> Stopped in pid 6987.6987 (yajl_test) at netbsd:breakpoint+0x5: leave
> ds e650
> es e600
> fs e640
> gs 10
> rdi 0
> rsi 1
> rbp ffffc880ef49e640
> rbx ffffffff80ed2f50 mutex_adaptive_lockops
> rdx 2
> rcx 0
> rax 0
> r8 ffffffff80ed2f50 mutex_adaptive_lockops
> r9 1
> r10 0
> r11 fffffffe
> r12 104
> r13 ffffffff80d43960 ostype+0xa6448
> r14 ffffc880ef49e688
> r15 ffffffff80d3c46b ostype+0x9ef53
> rip ffffffff8023e93d breakpoint+0x5
> cs e030
> rflags 202
> rsp ffffc880ef49e640
> ss e02b
> netbsd:breakpoint+0x5: leave
> db{1}> (XEN) [2021-05-14 18:09:45.682] Watchdog timer fired for domain 0
> (XEN) [2021-05-14 18:09:45.682] Hardware Dom0 shutdown: watchdog rebooting machine
>
> (I guess ddb.onpanic=1 and the Xen watchdog aren't very useful
> together!)
>
>
>> How-To-Repeat:
>
> I don't yet have an isolated example test, but running the
> regression tests in my robohack/yajl project, and in particular
> the "ap_eof_str" test, with USE_ASAN=yes and with the source and
> build on an NFS mount (which I'm only guessing about because of
> the nfs_*() calls in the kernel stack backtrace), has reliably
> reproduced this crash for me:
>
> $ cd /some/NFS/mountpoint
> $ git clone https://github.com/robohack/yajl
> $ cd yajl
> $ mkdir build
> $ MAKEOBJDIRPREFIX=$(/bin/pwd)/build make regress USE_ASAN=yes MKDOC=no
>
> If I understand correctly, the system call involved here is
> sysctl(2), and it has something to do with proc too, but
> I'm quite unfamiliar with ASAN runtime internals so I don't know
> what it's doing to cause this, especially since a couple of
> other tests have already run when this one crashes. I do know
> that ASAN will check to make sure ASLR is not enabled, and it
> will also mmap() something somewhere really high up and it fails
> unless you do "ulimit -v unlimited" first.
>
> If necessary I can try in a domU, or disable the Xen watchdog
> for the dom0 (as otherwise I only have 20 seconds before the
> reboot!), and try the crash again and do more DDB digging if
> someone can guide me along. And/Or I can change what's in
> ddb.commandonenter too...
>
>> Fix:
>
>> Unformatted:
> 2021-03-10T23:08:13Z
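
For readers unfamiliar with the message: in the report above, the "owner field" printed by the lock debugger equals the "current lwp", which is what "locking against myself" means. The LWP tried to acquire a mutex it already held. The following is only a user-space analogue, assuming POSIX threads and an error-checking mutex rather than the kernel's own mutex code; relock_error() is a hypothetical helper name and nothing here is taken from the NetBSD sources or from nfs.diff.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK */
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: lock an error-checking mutex twice from the
 * same thread and return the result of the second attempt.  With
 * PTHREAD_MUTEX_ERRORCHECK, POSIX specifies that the second attempt
 * fails with EDEADLK instead of hanging.
 */
int
relock_error(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int rc;

	if (pthread_mutexattr_init(&attr) != 0)
		return -1;
	if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
	    pthread_mutex_init(&m, &attr) != 0) {
		pthread_mutexattr_destroy(&attr);
		return -1;
	}
	pthread_mutexattr_destroy(&attr);

	if (pthread_mutex_lock(&m) != 0) {	/* first acquisition */
		pthread_mutex_destroy(&m);
		return -1;
	}
	rc = pthread_mutex_lock(&m);		/* same owner locks again */

	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);
	return rc;
}
```

Where user space gets an error code back, the kernel in the report treats the same condition as fatal and panics, which is why the trace ends in lockdebug_abort() and vpanic().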