
Re: How to prevent a mutex being _enter()ed from being _destroy()ed?



> Date: Fri, 10 Aug 2018 19:48:40 +0200
> From: Edgar Fuß <ef%math.uni-bonn.de@localhost>
> 
> > Yes -- isn't that the symptom you're seeing, or did I miss something?
> It's the mutex_oncpu in the while condition that crashes, not the one
> in the if condition above the do.

Are you sure it _only_ happens in the do/while and _never_ in the
preceding if?

I don't see any reason why ordering on other CPUs would treat the two
calls to mutex_oncpu differently.  It is possible that the spinlock
backoff in the do/while opens a window for the race that is wide
enough that you only ever see it in the do/while condition.
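
For reference, the code in question has roughly this shape (a
simplified paraphrase of mutex_vector_enter() in kern_mutex.c, with
the LOCKSTAT and kpreempt bookkeeping elided; not the verbatim
source):

	owner = mtx->mtx_owner;
	if (mutex_oncpu(owner)) {		/* first call */
		count = SPINLOCK_BACKOFF_MIN;
		do {
			/* backoff delay; this is what widens the window */
			SPINLOCK_BACKOFF(count);
			/*
			 * Re-read the owner; the mutex may have been
			 * destroyed and freed in the meantime.
			 */
			owner = mtx->mtx_owner;
		} while (mutex_oncpu(owner));	/* second call: the one
						   that crashes for you */
	}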

> > It doesn't really matter since (a) only one thread ever sets the
> > variable, (b) there are no invariants around it, and (c) you never
> > dereference it.  So, as soon as unp_gc decides it will use a
> > particular socket, it should just store the pointer to that socket in
> > some global unp_gc_current_socket, and when it's done (before closef),
> > it should set unp_gc_current_socket to null; then in soput/sofree,
> > just KASSERT(so != unp_gc_current_socket).
> But couldn't the thread that KASSERTs read a stale copy that unp_gc()
> has nulled out, because the null value didn't make it to the right
> CPU/cache/whatever?

Conceivably, yes.  Then you would have a false positive for your test.
I would guess that (a) that won't happen a lot, and (b) it'll be clear
on scrutiny that it's a false positive.

But I also don't think false positives are very likely, because in
correct code, soclose won't be called on a socket associated with a
file that still has a positive reference count, and that reference
count is all managed under fp->f_lock.

In any case, it's just a diagnostic, not a protocol for a robust
software system to rely on.  If it doesn't work, we can try another
one.
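
To spell the suggestion out, the diagnostic would look roughly like
this (placement is illustrative; the global is only ever written by
the unp_gc thread, and only ever compared against, never
dereferenced):

	/* Written only by unp_gc; compared against, never dereferenced. */
	static struct socket *unp_gc_current_socket;

	/* In unp_gc(), once it decides to use a particular socket: */
	unp_gc_current_socket = so;
	/* ... operate on so ... */
	unp_gc_current_socket = NULL;	/* done with it, before closef */
	closef(fp);

	/* In soput()/sofree(): */
	KASSERT(so != unp_gc_current_socket);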

