tech-kern archive


Re: assertion "solocked(sb->sb_so)" failed [Re: diagnostic assertion "solocked2(so, so2)" failed]



On Sun, Aug 16, 2009 at 09:30:49PM +0200, Manuel Bouyer wrote:
> On Wed, Aug 12, 2009 at 04:53:19PM +0200, Manuel Bouyer wrote:
> > On Tue, Aug 11, 2009 at 11:09:32AM +0200, Manuel Bouyer wrote:
> > > Hi,
> > > I got this panic on a netbsd-5 kernel:
> > > panic: kernel diagnostic assertion "solocked2(so, so2)" failed: file
> > > "/home/bouyer/src-5/src/sys/kern/uipc_usrreq.c", line 559
> > >
> > 
> > ad@ pointed me to kern/38968, which is the same problem. From his
> > analysis it's a broken assertion. I could just remove the KASSERT,
> > but I suspect other assertions using solocked2() are also broken; this
> > needs to be analysed. For now I'm just going to change solocked2() to
> > always return true, as suggested by ad.
> 
> After doing so I got this panic:
> 
> panic: kernel diagnostic assertion "solocked(sb->sb_so)" failed: file
> "/home/bouyer/src-5/src/sys/kern/uipc_socket2.c", line 726
>
> fatal breakpoint trap in supervisor mode
> trap type 1 code 0 eip c03d033c cs 8 eflags 246 cr2 8077132 ilevel 4
> Stopped in pid 16273.1 (sendmail) at    netbsd:breakpoint+0x4:  popl    %ebp
> db{1}> 
> breakpoint(c0634177,d5ad8b18,c2cca800,c02ffb02,cbf35d82,c32b003c,ffffffff,d5ad8c10,d5ad8c74,d5ad8c94)
>  at netbsd:breakpoint+0x4
> panic(c0673c0c,c05f9751,c05fb357,c062a5f0,2d6,1,d5ad8b4c,c035ede2,c05f9751,c062a5f0)
>  at netbsd:panic+0x1b0
> __kernassert(c05f9751,c062a5f0,2d6,c05fb357,c31bfba4,c2f94300,d5ad8b7c,c03645e1,c31bfc5c,c2e6a200)
>  at netbsd:__kernassert+0x39
> sbappend(c31bfc5c,c2e6a200,c2e6a23d,c,d1d186ac,d23d696c,c2e6a249,d1ded2c0,0,0)
>  at netbsd:sbappend+0x92
> uipc_usrreq(c31b6a54,9,c2e6a200,0,0,d18ad300,3,ad8be0,c31b6aac,0)
>  at netbsd:uipc_usrreq+0x741
> sosend(c31b6a54,0,d5ad8c94,c2e6a200,0,0,d18ad300,9,0,2)
>  at netbsd:sosend+0x403
> soo_write(d5bd6300,d5bd6300,d5ad8c94,d5a81000,1,d5ad8bdc,d5bd6300,d5ad8c54,0,10)
>  at netbsd:soo_write+0x3e
> do_filewritev(5,bfbfc370,2,d5bd6300,1,d5ad8d28,d5ad8d3c,c03d88e8,d18ad300,d5ad8d00)
>  at netbsd:do_filewritev+0x190
> sys_writev(d18ad300,d5ad8d00,d5ad8d28,8077000,d1d186ac,d1d186ac,1,5,bfbfc370,2)
>  at netbsd:sys_writev+0x37
> syscall(d5ad8d48,b3,ab,2b,2b,1,bfbfc3d8,bfbfc398,bb92a0b0,0)
>  at netbsd:syscall+0xc8
> 
> This sbappend() is called a few lines after the KASSERT(solocked2(so, so2))
> that was triggered before. m is not NULL. At this point this KASSERT
> doesn't look bogus; it doesn't look safe to get there without a lock held.

I think I found a problem with the way socket locks are changed
in uipc_usrreq.c: unp_setpeerlocks() is entered with both sockets locked
by the uipc_lock. But on return the sockets may be unlocked: a
KASSERT(mutex_owned(lock)) added at the end of unp_setpeerlocks() fired
as soon as sendmail started talking with milters:
__kernassert(c05f97b1,c062a938,fe,c062a7a7,0,d1149000,c2ecbb80,0,c2ecbb00,0)
 at netbsd:__kernassert+0x39
unp_setpeerlocks(c2ee7158,c2cca000,0,80,14,0,d1a5bc2c,c2ee7158,c2ee7158,4)
 at netbsd:unp_setpeerlocks+0x11a
uipc_usrreq(c2ee7158,5,0,c2e65a00,0,0,0,d1862d00,d1862d00,d1862d00)
 at netbsd:uipc_usrreq+0x294
soaccept(c2ee7158,c2e65a00,d1a5bc8c,c0319ec7,d1a5bc9c,c2e65a00,c2ee73d8,c0340e8f,d1a5bccc,7)
 at netbsd:soaccept+0x60
do_sys_accept(d1870580,4,d1a5bccc,d1a5bd28,d1a5bccc,bfbfed54,8,0,0,0)
 at netbsd:do_sys_accept+0x1e6
sys_accept(d1870580,d1a5bd00,d1a5bd28,bb904000,d1a7fe1c,d1a7fe1c,2,4,0,0)
 at netbsd:sys_accept+0x31
syscall(d1a5bd48,ffff00b3,ab,bfbf001f,bbbc001f,804c23c,4,bfbfecb8,bb905040,0)
 at netbsd:syscall+0xc8


After calling unp_setpeerlocks() uipc_usrreq(PRU_ACCEPT) will call
unp_setaddr() which, I think, expects the socket to be locked as
it does a sounlock()/solock() in a (probably rarely used) code path.
However, if unp_setpeerlocks() takes the new lock, I can't find where
it would be released, as do_sys_accept() releases the lock on so and not
on so2.

Also, unp_setpeerlocks() is called from unp_connect2(PRU_CONNECT),
and the caller, unp_connect(), expects the sockets to be locked on
return as it does a sounlock(). If I got it right, this is true because
we do solock(so); unp_resetlock(so); in unp_connect(), thus
unp->unp_streamlock points to a locked lock.

I'm not sure how to sort this out. Should uipc_usrreq(PRU_ACCEPT)
grab the unp->unp_streamlock before calling unp_setpeerlocks() and release
it afterwards?
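
To make the question concrete, here is a small user-space model of the
lock switch described above. It is only a sketch, not kernel code: the
toy_* names are hypothetical, the "mutex" is just a flag, and the only
thing modelled is that a solocked()-style check follows the socket's
lock pointer. With pretake == false it mimics the PRU_ACCEPT path as I
read it (the new lock is not held when the pointers are switched); with
pretake == true it mimics both the unp_connect() case, where the stream
lock is already held, and the proposed change.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for kmutex_t: only records whether it is held. */
struct toy_mutex {
	bool held;
};

/* Toy stand-in for struct socket: only the lock pointer matters here. */
struct toy_socket {
	struct toy_mutex *so_lock;
};

static void
toy_mutex_enter(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void
toy_mutex_exit(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Rough analogue of solocked(): is the mutex so_lock points to held? */
static bool
toy_solocked(const struct toy_socket *so)
{
	return so->so_lock->held;
}

/* Models only the pointer switch: both sockets start sharing newlock. */
static void
toy_setpeerlocks(struct toy_socket *so, struct toy_socket *so2,
    struct toy_mutex *newlock)
{
	so->so_lock = newlock;
	so2->so_lock = newlock;
}

/*
 * Run one switch.  Both sockets start out locked by a shared
 * domain-wide lock.  If pretake is true, the stream lock is taken
 * before the switch.  Returns whether both sockets still look locked
 * right after the switch.
 */
static bool
toy_locked_after_switch(bool pretake)
{
	struct toy_mutex domain = { false }, stream = { false };
	struct toy_socket so = { &domain }, so2 = { &domain };
	bool locked;

	toy_mutex_enter(&domain);
	assert(toy_solocked(&so) && toy_solocked(&so2));

	if (pretake)
		toy_mutex_enter(&stream);
	toy_setpeerlocks(&so, &so2, &stream);
	locked = toy_solocked(&so) && toy_solocked(&so2);

	/* Clean up: drop whatever is still held. */
	if (pretake)
		toy_mutex_exit(&stream);
	toy_mutex_exit(&domain);
	return locked;
}
```

Calling toy_locked_after_switch(false) reports the sockets as unlocked
after the switch, which is the situation the added KASSERT caught;
toy_locked_after_switch(true) reports them as locked. The model says
nothing about who should release which lock afterwards, which is the
open part of the question.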

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference

