NetBSD-Bugs archive


Re: pkg/54192: lang/rust build error



>  Here's a backtrace of the problem.
>  (Rust 1.36, netbsd-current as of ~october 2019)
>
>  Reading symbols from /usr/pkg/bin/cargo...
>  [New process 1]
>  Core was generated by `cargo'.
>  Program terminated with signal SIGABRT, Aborted.
>  #0  0x00007f7f8680d09a in _lwp_kill () from /usr/libexec/ld.elf_so
>  (gdb) bt
>  #0  0x00007f7f8680d09a in _lwp_kill () from /usr/libexec/ld.elf_so
>  #1  0x00007f7f8680cf09 in abort () from /usr/libexec/ld.elf_so
>  #2  0x00007f7f86801616 in _rtld_shared_enter () from /usr/libexec/ld.elf_so
>  #3  0x00007f7f86800b91 in _rtld_bind () from /usr/libexec/ld.elf_so
>  #4  0x00007f7f868007fd in _rtld_bind_start () from /usr/libexec/ld.elf_so
>  #5  0x0000000000000206 in ?? ()
>  #6  0x0000785c51a9043a in dup2 () from /usr/lib/libc.so.12
>  #7  0x0000785c51b18592 in je_jemalloc_prefork () from /usr/lib/libc.so.12
>  #8  0x0000785c5375c000 in ?? ()
>  #9  0x000000000000009c in ?? ()
>  #10 0x0000785c5260a0ee in pthread_sigmask () from /usr/lib/libpthread.so.1
>  #11 0x00000000521845fd in std::sys::unix::process::process_inner::<impl std::sys::unix::process::process_common::Command>::do_exec ()
...

I admit that I have never understood, in any meaningful way, what the
"dead lock detected" error in ld.elf_so is actually objecting to.

I suspect that the condition starts with

"You cannot, from two different threads in a process, simultaneously
do <x>", but I have never really grasped what the possible "<x>"
conditions are.
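
Judging only from the two checks visible in the diff attached below,
the abort seems to fire when the lock word equals the calling LWP's
own id with the exclusive bit set, i.e. when the LWP that already holds
the exclusive lock asks for a lock again (the shared case is let
through if _rtld_mutex_may_recurse is set).  A toy model of that test
follows; the mask value and the toy_ names are assumptions made for
illustration, not the real rtld code:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the real constant lives in the ld.elf_so sources. */
#define TOY_EXCLUSIVE_MASK 0x80000000U

/* Lock word as it would look while LWP `self` holds the exclusive lock. */
static unsigned int
toy_exclusive_word(unsigned int self)
{
	assert(self != 0 && (self & TOY_EXCLUSIVE_MASK) == 0);
	return self | TOY_EXCLUSIVE_MASK;
}

/*
 * True if LWP `self`, seeing lock word `cur`, would hit the
 * "dead lock detected" branch: it already owns the exclusive lock.
 */
static bool
toy_would_deadlock(unsigned int cur, unsigned int self)
{
	return cur == toy_exclusive_word(self);
}
```

In this model, an exclusive lock owned by a different LWP, or a lock
word that only counts shared holders, does not trip the check; those
cases would presumably just wait.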

Looking at the code in rtld.c, it appears that the exclusive lock is
held while either loading a shared library (and its dependencies),
via dlopen(), or while doing the book-keeping for calling the
"_init" or "_fini" functions of any shared libraries (but not
actually while those functions are invoked(?)).  However, it's not
clear to me whether __HAVE_FUNCTION_DESCRIPTORS is defined or not,
and therefore, under which circumstances either the exclusive or
shared lock is used by e.g. do_dlsym().  And further, is any lock
taken when a given function in a shared library is called the first
time?  Or isn't ld.elf_so code involved in that at all?

And what happens if some shared locks are held, but you happen to
desire an exclusive lock?  I'm not able to tell from reading the
code...

Next, it's also not clear to me whether the restrictions imposed by
the locking in ld.elf_so are ... "reasonable", i.e. whether this can
be considered a bug in our ld.elf_so which we ought to fix, or
whether it's rust / cargo doing something it should not do (and, if
so, whether that restriction follows from some standard or other,
though that seems doubtful).

At a minimum, I'd say that the diagnostic could be better, i.e.
ld.elf_so ought to itself be able to tell which "<x>" condition is
violated, preferably in some terms that are more easily understood
than simply "dead lock detected".  To that end I've drafted the
attached diff to add some more verbosity, FWIW (compile-tested on
i386 only; I've tentatively added the x86_64 _rtld_bind as well, and
other CPUs need similar treatment).
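
The idea of the diff can be sketched in isolation: remember a reason
string when the exclusive lock is taken, clear it on release, and
include both the wanted and the held reason in the message.  The code
below is a single-threaded toy with assumed names (toy_*) and an
assumed mask value, not the real ld.elf_so code; it returns the message
instead of calling _rtld_error() and _rtld_die():

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_EXCLUSIVE_MASK 0x80000000U	/* assumed value */

static unsigned int toy_mutex;			/* 0 means unlocked */
static const char *toy_exclusive_reason;	/* set while held exclusively */
static char toy_msg[160];

/* Take the exclusive lock for LWP `self` and remember why. */
static void
toy_exclusive_enter(unsigned int self, const char *why)
{
	assert(why != NULL && strlen(why) > 0);
	assert(toy_mutex == 0);		/* toy model: no waiting */
	toy_mutex = self | TOY_EXCLUSIVE_MASK;
	toy_exclusive_reason = why;
}

/* Release the exclusive lock and forget the reason. */
static void
toy_exclusive_exit(void)
{
	toy_mutex = 0;
	toy_exclusive_reason = NULL;
}

/*
 * Take a shared lock.  Returns NULL on success, or the diagnostic
 * that would be reported when `self` already holds the exclusive lock.
 */
static const char *
toy_shared_enter(unsigned int self, const char *why)
{
	if (toy_mutex == (self | TOY_EXCLUSIVE_MASK)) {
		snprintf(toy_msg, sizeof(toy_msg),
		    "dead lock detected, want shared lock for %s, "
		    "exclusive lock %s held", why, toy_exclusive_reason);
		return toy_msg;
	}
	assert((toy_mutex & TOY_EXCLUSIVE_MASK) == 0);	/* toy: no waiting */
	toy_mutex++;			/* one more shared holder */
	return NULL;
}
```

For example, toy_exclusive_enter(7, "dlopen") followed by
toy_shared_enter(7, "_rtld_bind") yields "dead lock detected, want
shared lock for _rtld_bind, exclusive lock dlopen held".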

Regards,

- Håvard

Index: reloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/reloc.c,v
retrieving revision 1.110
diff -u -r1.110 reloc.c
--- reloc.c	27 Apr 2017 08:37:15 -0000	1.110
+++ reloc.c	15 Oct 2019 21:15:15 -0000
@@ -263,7 +263,7 @@
 	_rtld_shared_exit();
 	target = _rtld_call_function_addr(obj,
 	    (Elf_Addr)obj->relocbase + def->st_value);
-	_rtld_shared_enter();
+	_rtld_shared_enter("_rtld_resolve_ifunc done");
 
 	return target;
 }
Index: rtld.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/rtld.c,v
retrieving revision 1.183.4.2
diff -u -r1.183.4.2 rtld.c
--- rtld.c	29 Aug 2017 09:43:17 -0000	1.183.4.2
+++ rtld.c	15 Oct 2019 21:15:15 -0000
@@ -140,7 +140,7 @@
 {
 	_rtld_exclusive_exit(mask);
 	_rtld_call_function_void(obj, func);
-	_rtld_exclusive_enter(mask);
+	_rtld_exclusive_enter(mask, "initfini_done");
 }
 
 static void
@@ -372,7 +372,7 @@
 
 	dbg(("rtld_exit()"));
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "rtld_exit");
 
 	_rtld_call_fini_functions(&mask, 1);
 
@@ -735,7 +735,7 @@
 
 	_rtld_debug_state();	/* say hello to gdb! */
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "init functions");
 
 	dbg(("calling _init functions"));
 	_rtld_call_init_functions(&mask);
@@ -942,7 +942,7 @@
 
 	dbg(("dlclose of %p", handle));
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "dlclose");
 
 	root = _rtld_dlcheck(handle);
 
@@ -989,7 +989,7 @@
 
 	dbg(("dlopen of %s %d", name, mode));
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "dlopen");
 
 	flags |= (mode & RTLD_GLOBAL) ? _RTLD_GLOBAL : 0;
 	flags |= (mode & RTLD_NOLOAD) ? _RTLD_NOLOAD : 0;
@@ -1079,10 +1079,10 @@
 #endif
 
 #ifdef __HAVE_FUNCTION_DESCRIPTORS
-#define	lookup_mutex_enter()	_rtld_exclusive_enter(&mask)
+#define	lookup_mutex_enter(why)	_rtld_exclusive_enter(&mask, why)
 #define	lookup_mutex_exit()	_rtld_exclusive_exit(&mask)
 #else
-#define	lookup_mutex_enter()	_rtld_shared_enter()
+#define	lookup_mutex_enter(why)	_rtld_shared_enter(why)
 #define	lookup_mutex_exit()	_rtld_shared_exit()
 #endif
 
@@ -1099,7 +1099,7 @@
 	sigset_t mask;
 #endif
 
-	lookup_mutex_enter();
+	lookup_mutex_enter("do_dlsym");
 
 	hash = _rtld_elf_hash(name);
 	def = NULL;
@@ -1195,7 +1195,7 @@
 		if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC) {
 #ifdef __HAVE_FUNCTION_DESCRIPTORS
 			lookup_mutex_exit();
-			_rtld_shared_enter();
+			_rtld_shared_enter("resolve_ifunc");
 #endif
 			p = (void *)_rtld_resolve_ifunc(defobj, def);
 			_rtld_shared_exit();
@@ -1275,7 +1275,7 @@
 
 	dbg(("dladdr of %p", addr));
 
-	lookup_mutex_enter();
+	lookup_mutex_enter("dladdr");
 
 #ifdef __HAVE_FUNCTION_DESCRIPTORS
 	addr = _rtld_function_descriptor_function(addr);
@@ -1348,7 +1348,7 @@
 
 	dbg(("dlinfo for %p %d", handle, req));
 
-	_rtld_shared_enter();
+	_rtld_shared_enter("dlinfo");
 
 	if (handle == RTLD_SELF) {
 #ifdef __powerpc__
@@ -1397,7 +1397,7 @@
 
 	dbg(("dl_iterate_phdr"));
 
-	_rtld_shared_enter();
+	_rtld_shared_enter("dl_iterate_phdr");
 
 	for (obj = _rtld_objlist;  obj != NULL;  obj = obj->next) {
 		phdr_info.dlpi_addr = (Elf_Addr)obj->relocbase;
@@ -1436,7 +1436,7 @@
 
 	dbg(("__dl_cxa_refcount of %p with %zd", addr, delta));
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "__dl_cxa_refcount");
 	obj = _rtld_obj_from_addr(addr);
 
 	if (obj == NULL) {
@@ -1580,8 +1580,10 @@
 static volatile unsigned int _rtld_waiter_exclusive;
 static volatile unsigned int _rtld_waiter_shared;
 
+const char *exclusive_lock_reason;
+
 void
-_rtld_shared_enter(void)
+_rtld_shared_enter(const char *why)
 {
 	unsigned int cur;
 	lwpid_t waiter, self = 0;
@@ -1608,7 +1610,10 @@
 		if (cur == (self | RTLD_EXCLUSIVE_MASK)) {
 			if (_rtld_mutex_may_recurse)
 				return;
-			_rtld_error("dead lock detected");
+			if (exclusive_lock_reason)
+				_rtld_error("dead lock detected, want shared lock for %s, exclusive lock %s held", why, exclusive_lock_reason);
+			else
+				_rtld_error("dead lock detected, want shared lock for %s", why);
 			_rtld_die();
 		}
 		waiter = atomic_swap_uint(&_rtld_waiter_shared, self);
@@ -1652,7 +1657,7 @@
 }
 
 void
-_rtld_exclusive_enter(sigset_t *mask)
+_rtld_exclusive_enter(sigset_t *mask, const char *why)
 {
 	lwpid_t waiter, self = _lwp_self();
 	unsigned int locked_value = (unsigned int)self | RTLD_EXCLUSIVE_MASK;
@@ -1672,6 +1677,9 @@
 		membar_sync();
 		cur = _rtld_mutex;
 		if (cur == locked_value) {
-			_rtld_error("dead lock detected");
+			if (exclusive_lock_reason)
+				_rtld_error("dead lock detected, want exclusive lock for %s, but exclusive lock %s held", why, exclusive_lock_reason);
+			else
+				_rtld_error("dead lock detected, want exclusive lock for %s", why);
 			_rtld_die();
 		}
@@ -1682,6 +1690,7 @@
 		if (waiter)
 			_lwp_unpark(waiter, __UNVOLATILE(&_rtld_mutex));
 	}
+	exclusive_lock_reason = why;
 }
 
 void
@@ -1692,6 +1701,8 @@
 	membar_exit();
 	_rtld_mutex = 0;
 	membar_sync();
+	exclusive_lock_reason = NULL;
+
 	if ((waiter = _rtld_waiter_exclusive) != 0)
 		_lwp_unpark(waiter, __UNVOLATILE(&_rtld_mutex));
 
Index: rtld.h
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/rtld.h,v
retrieving revision 1.126.6.3
diff -u -r1.126.6.3 rtld.h
--- rtld.h	29 Aug 2017 09:43:17 -0000	1.126.6.3
+++ rtld.h	15 Oct 2019 21:15:15 -0000
@@ -378,9 +378,9 @@
 Objlist_Entry *_rtld_objlist_find(Objlist *, const Obj_Entry *);
 void _rtld_ref_dag(Obj_Entry *);
 
-void _rtld_shared_enter(void);
+void _rtld_shared_enter(const char *);
 void _rtld_shared_exit(void);
-void _rtld_exclusive_enter(sigset_t *);
+void _rtld_exclusive_enter(sigset_t *, const char *);
 void _rtld_exclusive_exit(sigset_t *);
 
 /* expand.c */
Index: tls.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/tls.c,v
retrieving revision 1.10.8.1
diff -u -r1.10.8.1 tls.c
--- tls.c	25 Jul 2017 01:36:58 -0000	1.10.8.1
+++ tls.c	15 Oct 2019 21:15:15 -0000
@@ -63,7 +63,7 @@
 	void **dtv, **new_dtv;
 	sigset_t mask;
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "_rtld_tls_get_addr");
 
 	dtv = tcb->tcb_dtv;
 
@@ -157,7 +157,7 @@
 	struct tls_tcb *tcb;
 	sigset_t mask;
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "_rtld_tls_allocate");
 	tcb = _rtld_tls_allocate_locked();
 	_rtld_exclusive_exit(&mask);
 
@@ -171,7 +171,7 @@
 	uint8_t *p, *p_end;
 	sigset_t mask;
 
-	_rtld_exclusive_enter(&mask);
+	_rtld_exclusive_enter(&mask, "_rtld_tls_free");
 
 #ifdef __HAVE_TLS_VARIANT_I
 	p = (uint8_t *)tcb;
Index: arch/i386/mdreloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/i386/mdreloc.c,v
retrieving revision 1.37.8.1
diff -u -r1.37.8.1 mdreloc.c
--- arch/i386/mdreloc.c	4 Jul 2017 12:47:59 -0000	1.37.8.1
+++ arch/i386/mdreloc.c	15 Oct 2019 21:15:15 -0000
@@ -260,7 +260,7 @@
 
 	new_value = 0;	/* XXX gcc */
 
-	_rtld_shared_enter();
+	_rtld_shared_enter("_rtld_bind");
 	err = _rtld_relocate_plt_object(obj, rel, &new_value);
 	if (err)
 		_rtld_die();
Index: arch/x86_64/mdreloc.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/arch/x86_64/mdreloc.c,v
retrieving revision 1.41.8.1
diff -u -r1.41.8.1 mdreloc.c
--- arch/x86_64/mdreloc.c	4 Jul 2017 12:47:59 -0000	1.41.8.1
+++ arch/x86_64/mdreloc.c	15 Oct 2019 21:15:15 -0000
@@ -342,7 +342,7 @@
 
 	new_value = 0; /* XXX GCC4 */
 
-	_rtld_shared_enter();
+	_rtld_shared_enter("_rtld_bind");
 	error = _rtld_relocate_plt_object(obj, rela, &new_value);
 	if (error)
 		_rtld_die();

