kernel_lock, splbio() and SMP_SAFE

To: tech-kern%netbsd.org@localhost
Subject: kernel_lock, splbio() and SMP_SAFE
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Date: Mon, 3 Aug 2009 13:37:56 +0200

Hello,
there's one thing I didn't get in the new world order of the NetBSD
kernel (on i386, FWIW):
given a low-level subsystem (say, a driver) which is not SMP-safe
and is using the old spl() locking scheme, how is this subsystem
protected from SMP-safe upper-level subsystems calling one of its
entry points? I would expect kernel_lock() to be taken at some
point, but I can't find where that happens in this case.
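
To make the expectation concrete, here is a minimal user-space sketch in C
of the pattern I would expect: an MP-safe caller takes a single big lock
before entering a legacy entry point, and the entry point checks that the
lock is held. This is only an analogy, not NetBSD code. All names
(big_lock, holds_big_lock, legacy_start, legacy_start_locked) are
hypothetical, and a pthread mutex stands in for the kernel lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a single "giant" kernel lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread flag: does the calling thread currently hold big_lock? */
static _Thread_local bool holds_big_lock;

static void big_lock_acquire(void)
{
	pthread_mutex_lock(&big_lock);
	holds_big_lock = true;
}

static void big_lock_release(void)
{
	holds_big_lock = false;
	pthread_mutex_unlock(&big_lock);
}

/* State that is only safe to touch while big_lock is held. */
static int legacy_start_calls;

/*
 * Hypothetical non-MP-safe driver entry point. Where a kernel would
 * assert that the lock is held, this sketch returns false instead so
 * the failure can be observed without aborting.
 */
static bool legacy_start(void)
{
	if (!holds_big_lock)
		return false;
	legacy_start_calls++;
	return true;
}

/* What an MP-safe upper layer would be expected to do before calling in. */
static bool legacy_start_locked(void)
{
	big_lock_acquire();
	bool ok = legacy_start();
	big_lock_release();
	return ok;
}
```

Calling legacy_start() directly fails the check, while
legacy_start_locked() passes it. My question is which layer, if any, is
supposed to play the role of legacy_start_locked() for the disk drivers.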

I added
KASSERT(__SIMPLELOCK_LOCKED_P(kernel_lock));
in wdstart() and sdstart(), and ended up with:
root on raid0a dumps on raid0b
panic: kernel diagnostic assertion "__SIMPLELOCK_LOCKED_P(kernel_lock)" failed: 
file "/dsk/l1/misc/bouyer/netbsd-5/src/sys/dev/ata/wd.c", line 620
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c03f26dc cs 8 eflags 246 cr2 0 ilevel 6
Stopped in pid 0.1 (system) at  netbsd:breakpoint+0x4:  popl    %ebp
db{1}>  tr
breakpoint(c0646f16,c07c69d4,c2cf2800,0,2,0,c07c6a08,c035190f,5f,0) at 
netbsd:breakpoint+0x4
panic(c068fb90,c061b30d,c065b8f8,c065c99c,26c,0,c07c6a08,c041bf45,c061b30d,c065c99c)
 at netbsd:panic+0x1b0
__kernassert(c061b30d,c065c99c,26c,c065b8f8,0,c2e68a1c,c07c6a48,c041c437,ce31a7d0,c2e68a1c)
 at netbsd:__kernassert+0x39
wdstart(ce31a7d0,c2e68a1c,1,0,6,8,c07c6a48,400,5f,0) at netbsd:wdstart+0xf5
wdstrategy(c2e68a1c,c0713fd8,c07c6a98,0,0,c2df8000,c07c6c38,c0186c9a,8,ce322678)
 at netbsd:wdstrategy+0x1c7
raidread_component_label(8,ce322678,c07c6aa4,c0315077,0,c069ff20,0,0,4,4) at 
netbsd:raidread_component_label+0x69
rf_markalldirty(1203,c018a550,c2df5200,c2df5000,c070bf80,c0709140,cd5ee240,c2cf2918,c0709140,4)
 at netbsd:rf_markalldirty+0x9a
raidopen(1201,0,6000,c069ff20,0,c2df7000,0,0,14,0) at netbsd:raidopen+0x240
raidsize(1201,0,0,0,0,14,c07c6d38,c030a04b,0,0) at netbsd:raidsize+0x102
cpu_dumpconf(0,0,14,0,0,c03098a0,0,0,c07090b8,0) at netbsd:cpu_dumpconf+0x37
main(0,c01002a7,0,0,0,0,0,0,0,0) at netbsd:main+0x28b

I can't believe we have such a big synchronisation issue in our kernel
that would have gone unnoticed for so long. Can someone explain to me how this is
supposed to work?

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

Follow-Ups:
- Re: kernel_lock, splbio() and SMP_SAFE
  - From: Mindaugas Rasiukevicius
- Re: kernel_lock, splbio() and SMP_SAFE
  - From: Manuel Bouyer
