tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: assertion "solocked(sb->sb_so)" failed [Re: diagnostic assertion "solocked2(so, so2)" failed]
On Mon, Aug 17, 2009 at 12:19:15AM +0200, Manuel Bouyer wrote:
> > > but I suspect other assertions using solocked2() are also broken; this
> > > needs
> > > to be analysed. For now I'm just going to change solocked2() to always
> > > return
> > > true, as suggected by ad.
> >
> > After doing so I got this panic:
> >
> > panic: kernel diagnostic assertion "solocked(sb->sb_so)" failed: file
> > "/home/bouyer/src-5/src/sys/kern/uipc_socket2.c", line 726
> >
> > [...]
> > this sbappend() is called a few lines after the KASSERT(solocked2(so, so2))
> > that was triggered before. m is not NULL. At this point this KASSERT
> > doens't look bogus, it doesn't look safe to get there without a lock held.
On second though it's also an issue with solocked(), but the code
should be safe. Here's a possible senario: when we do unp_setpeerlocks(),
we do unp_resetlock(so); at this point the thread doing a write on the socket
can grab the lock and go on. When we call solocked() on the remote end,
so_lock still points to uipc_lock because the first thread has not done the
unp_resetlock(so2) yet. This threads loads so_lock in a register and
is preempted. When it comes back, the first threads has completed the
unp_resetlock(so2) and released uipc_lock. the KASSERT fires but at this
time so_lock doens't point to the checked lock any more.
this means that because of this race, solocked() and solocked2() are
not reliable.
>
> I think I found a problem with the way socket locks are changed
> in uipc_usrreq.c: unp_setpeerlocks() takes both sockets locked by the
> uipc_lock. But on return the sockets may be unlocked: a
> KASSERT(mutex_owned(lock) at the end of unp_setpeerlocks() fired as
> soon as sendmail started talking with milters:
> __kernassert(c05f97b1,c062a938,fe,c062a7a7,0,d1149000,c2ecbb80,0,c2ecbb00,0)
> at netbsd:__kernassert+0x39
> unp_setpeerlocks(c2ee7158,c2cca000,0,80,14,0,d1a5bc2c,c2ee7158,c2ee7158,4) at
> netbsd:unp_setpeerlocks+0x11a
> uipc_usrreq(c2ee7158,5,0,c2e65a00,0,0,0,d1862d00,d1862d00,d1862d00) at
> netbsd:uipc_usrreq+0x294
> soaccept(c2ee7158,c2e65a00,d1a5bc8c,c0319ec7,d1a5bc9c,c2e65a00,c2ee73d8,c0340e8f,d1a5bccc,7)
> at netbsd:soaccept+0x60
> do_sys_accept(d1870580,4,d1a5bccc,d1a5bd28,d1a5bccc,bfbfed54,8,0,0,0) at
> netbsd:do_sys_accept+0x1e6
> sys_accept(d1870580,d1a5bd00,d1a5bd28,bb904000,d1a7fe1c,d1a7fe1c,2,4,0,0) at
> netbsd:sys_accept+0x31
> syscall(d1a5bd48,ffff00b3,ab,bfbf001f,bbbc001f,804c23c,4,bfbfecb8,bb905040,0)
> at netbsd:syscall+0xc8
>
>
> After calling unp_setpeerlocks() uipc_usrreq(PRU_ACCEPT) will call
> unp_setaddr() which, I think, expects the socket to be locked as
> it does a sounlock()/solock() in a (probably rarely used) code path.
> However, if unp_setpeerlocks() takes the new lock I'm can't find where
> it would be released as do_sys_accept() releases the lock on so and not
> so2.
>
> Also, unp_setpeerlocks() is called from unp_connect2(PRU_CONNECT),
> and the caller, unp_connect(), expects the sockets to be locked on
> return as it does a sounlock(). If I got it right, this is true because
> we do solock(so); unp_resetlock(so); in unp_connect(), thus
> unp->unp_streamlock points to a locked lock.
>
> I'm not sure how to sort this. Should uipc_usrreq(PRU_ACCEPT)
> grab the unp->unp_streamlock before calling unp_setpeerlocks() and release
> it after ?
The attached patch does this, and seems to work fine. It should make
solocked() and solocked2() reliable, and also fix the unlock/lock issue
in unp_setaddr() for the PRU_ACCEPT case.
Any comment on this ?
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 ans d'experience feront toujours la difference
--
Index: uipc_usrreq.c
===================================================================
RCS file: /cvsroot/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.119.4.2
diff -u -r1.119.4.2 uipc_usrreq.c
--- uipc_usrreq.c 18 Mar 2009 05:33:23 -0000 1.119.4.2
+++ uipc_usrreq.c 17 Aug 2009 11:32:37 -0000
@@ -251,6 +251,11 @@
unp->unp_streamlock = NULL;
mutex_obj_hold(lock);
membar_exit();
+ /*
+ * possible race if lock is not held - see comment in
+ * uipc_usrreq(PRU_ACCEPT).
+ */
+ KASSERT(mutex_owned(lock));
solockreset(so, lock);
solockreset(so2, lock);
}
@@ -327,6 +332,7 @@
struct unpcb *unp;
bool ext;
+ KASSERT(solocked(so));
unp = sotounpcb(so);
ext = false;
@@ -443,7 +449,17 @@
* If the connection is fully established, break the
* association with uipc_lock and give the connected
* pair a seperate lock to share.
+ * There is a race here: sotounpcb(so2)->unp_streamlock
+ * is not locked, so when changing so2->so_lock
+ * another thread can grab it while so->so_lock is still
+ * pointing to the (locked) uipc_lock.
+ * this should be harmless, exept that this makes
+ * solocked2() and solocked() unreliable.
+ * Another problem is that unp_setaddr() expects the
+ * the socket locked. Grabing sotounpcb(so2)->unp_streamlock
+ * fixes both issues.
*/
+ mutex_enter(sotounpcb(so2)->unp_streamlock);
unp_setpeerlocks(so2, so);
/*
* Only now return peer's address, as we may need to
@@ -454,6 +470,7 @@
* error == 0 and sun_noname as the peer address.
*/
unp_setaddr(so, nam, true);
+ mutex_exit(so2->so_lock);
break;
case PRU_SHUTDOWN:
Home |
Main Index |
Thread Index |
Old Index