Subject: Re: Is the netBSD kernel Preemptible ?
To: NetBSD Performance Technical Discussion List <tech-perform@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: tech-perform
Date: 06/14/2002 20:25:12
[ On Friday, June 14, 2002 at 23:25:21 (+0100), David Laight wrote: ]
> Subject: Re: Is the netBSD kernel Preemptible ?
>
> Or any other SMP system?
> I think I started writing SMP drivers in 1994!

I'm pretty sure I wrote a driver for an SMP system about the same time,
if not even before, but since I was just using that system's DDI/DKI
specification it was a no-brainer (i.e. no special SMP stuff -- I tested
the driver on a single processor system right up to about a week before
it went into use on the SMP system).

> > and say that since device drivers
> > are merely subroutines called by the kernel at appropriate places that
> > it should not be necessary to do anything to make them SMP compatible
> > unless they reach back into bits of kernel storage that do have SMP
> > interlock requirements.  I.e. if a driver is well behaved and just does
> > hardware manipulation then it should be fine.
> 
> Er no - there is not necessarily anything to stop your driver
> code being called at the same time on multiple CPUs (even for
> the device).  Any data structures have to be locked - the effects
> of getting it wrong are very difficult to pin down.

Not with the implementation I worked with -- IIRC the DDI specification
ensured the driver entry point could not be used by any more than one
CPU at a time, which is I think the only sane way to write drivers that
are portable across MP, SMP, and single-CPU systems.  Bach and Buroff
described this technique (and the re-write of sleep()/wakeup() to use a
hashed semaphore pool) back in 1984 when they reported on the various
multi-processor ports done back then.  The driver I wrote worked on one
of those systems.

Your reply prompted me to look up the details of SysVr4/MP in Vahalia
(UNIX Internals), and I do see discussion about modifications for
"MP-safe" drivers in the SysVr4/MP DDI/DKI.  I suppose for some DDI
calls it makes sense to have reentrancy (e.g. ioctl(), and maybe open()),
but I really don't see a need for it for most other calls -- a single
semaphore around the driver entry point that the CPU must acquire before
calling the driver seems as if it would be more than sufficient -- after
all most drivers will have to lock something critical almost immediately
as they begin and will likely keep that lock until the call "returns".
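
To make the "single semaphore around the driver entry point" idea
concrete, here is a minimal user-space sketch.  It is not taken from any
real DDI/DKI: the names drv_gate, drv_call, toy_entry and run_demo are
all made up for illustration, a pthread mutex stands in for the kernel
semaphore, and threads stand in for CPUs.  The toy driver does no
locking of its own, yet its private counter stays consistent because
every call goes through the per-driver gate.

```c
/* Build with: cc -pthread gate.c */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical entry-point type; not taken from any real DDI/DKI. */
typedef int (*drv_entry_t)(void *arg);

/* One gate per driver (not one for the whole kernel). */
struct drv_gate {
    pthread_mutex_t lock;  /* stands in for the kernel semaphore */
    int inside;            /* callers currently inside the driver */
    int max_inside;        /* high-water mark of 'inside' */
};

/* Every call into the driver goes through here. */
static int drv_call(struct drv_gate *g, drv_entry_t entry, void *arg)
{
    int rv;

    pthread_mutex_lock(&g->lock);
    if (++g->inside > g->max_inside)
        g->max_inside = g->inside;
    rv = entry(arg);            /* driver runs with the gate held */
    g->inside--;
    pthread_mutex_unlock(&g->lock);
    return rv;
}

/* Toy driver: touches its private state with no locking of its own. */
static long toy_count;

static int toy_entry(void *arg)
{
    (void)arg;
    toy_count++;
    return 0;
}

struct worker {
    struct drv_gate *gate;
    int iters;
};

/* Each thread plays the part of one CPU calling into the driver. */
static void *worker_main(void *p)
{
    struct worker *w = p;
    int i;

    for (i = 0; i < w->iters; i++)
        drv_call(w->gate, toy_entry, NULL);
    return NULL;
}

/* Runs nthreads "CPUs" against one driver and returns the final count. */
long run_demo(int nthreads, int iters)
{
    struct drv_gate gate;
    struct worker w;
    pthread_t tid[16];
    int i;

    assert(nthreads > 0 && nthreads <= 16);
    pthread_mutex_init(&gate.lock, NULL);
    gate.inside = gate.max_inside = 0;
    w.gate = &gate;
    w.iters = iters;
    toy_count = 0;

    for (i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker_main, &w);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    assert(gate.max_inside == 1);  /* never two callers inside at once */
    pthread_mutex_destroy(&gate.lock);
    return toy_count;
}
```

Calling run_demo(4, 10000) returns 40000 with the gate in place.  The
cost, of course, is that only one "CPU" is ever inside the driver, which
is exactly the trade-off being argued about in this thread.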

Still I think Bach and Buroff identify the more important rule of thumb
here when they say:

    But more than half of the UNIX operating system currently consists
    of device drivers, and new drivers are being added at an
    accelerating rate to support new peripherals and to provide new or
    enhanced services.  In practice, therefore, the number and
    volatility of the drivers make it difficult to change them for
    multiprocessor systems and keep them up to date with changes made
    for other UNIX systems, so it is important to keep most driver code
    identical over all implementations.

That's from "Multiprocessor UNIX Operating Systems" in the Oct. 1984
edition of the BSTJ.

They do say that I/O bound jobs don't do quite so well as CPU bound jobs
(which with their state of the art were running at 1.7 times the
throughput on a two-CPU system as on a single CPU system).  Still I'd
like to see some numbers on modern hardware before I would go so far as
to admit that MP-specific driver coding is really worth the effort.

> > Unfortunately I don't believe there's yet a well defined Device Driver
> > Kernel Interface specification so it's hard to know whether all the
> > necessary routines are SMP compatible and whether or not a given driver
> > is DDK compliant (and thus implicitly SMP compatible or not).
> 
> In the end most things have to do the required locking for SMP.
> For certain things (maybe simple device drivers) a DDK interface
> can be used to tell the kernel that a particular driver isn't MP
> clean - so the kernel can apply a global lock on the calls into that
> driver.

Your use of the word "global" is very disturbing and, IMNSHO, incorrect.

>  However this will only work if the kernel knows where these
> entry points are.

The driver entry points are obviously very well known by the kernel.
They're identical for all drivers.  Even in *BSD this part of the device
driver API is _very_ well defined.  The kernel really cannot call a
driver routine that it doesn't already know about!

(Vahalia lists 11 driver entry points for SysVr4, one used only for
block devices, one used only by disk devices, two used only by character
devices, and one or two used only for memory-mapped character devices.
The BCI documentation for SysVr3 lists 13 driver entry points, and IIRC
there are actually a couple more for SysVr4 too.)
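
As a rough illustration of why the kernel always knows where the entry
points are, and why any lock it takes on behalf of a driver need not be
global, here is a cut-down user-space sketch of a switch table in the
spirit of a BSD cdevsw.  Everything in it is hypothetical: struct
toy_devsw, the TOY_D_MPSAFE flag, and the toy_dev_*() wrappers are my
own names for the sake of the example, not any existing kernel's API.
The kernel side only ever calls the driver through the table, so it can
take a per-driver lock for drivers that have not declared themselves
MP-safe and skip it for those that have.

```c
/* Build with: cc -pthread devsw.c */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_D_MPSAFE 0x01  /* hypothetical: driver does its own locking */

/* Hypothetical, cut-down entry-point table; same shape for every driver. */
struct toy_devsw {
    int (*d_open)(int minor);
    int (*d_close)(int minor);
    int (*d_ioctl)(int minor, unsigned long cmd, void *data);
    int d_flags;
    pthread_mutex_t d_lock;      /* per-driver; used only if !TOY_D_MPSAFE */
    unsigned long d_serialized;  /* calls that went through d_lock */
};

/* Take the per-driver lock unless the driver says it is MP-safe. */
static int toy_enter(struct toy_devsw *d)
{
    if (d->d_flags & TOY_D_MPSAFE)
        return 0;
    pthread_mutex_lock(&d->d_lock);
    d->d_serialized++;
    return 1;
}

static void toy_leave(struct toy_devsw *d, int locked)
{
    if (locked)
        pthread_mutex_unlock(&d->d_lock);
}

/* "Kernel" wrappers: the only paths by which a driver is ever called. */
int toy_dev_open(struct toy_devsw *d, int minor)
{
    int locked = toy_enter(d);
    int rv = d->d_open(minor);

    toy_leave(d, locked);
    return rv;
}

int toy_dev_close(struct toy_devsw *d, int minor)
{
    int locked = toy_enter(d);
    int rv = d->d_close(minor);

    toy_leave(d, locked);
    return rv;
}

int toy_dev_ioctl(struct toy_devsw *d, int minor, unsigned long cmd,
    void *data)
{
    int locked = toy_enter(d);
    int rv = d->d_ioctl(minor, cmd, data);

    toy_leave(d, locked);
    return rv;
}

/* A do-nothing driver to plug into the table. */
static int null_open(int minor) { (void)minor; return 0; }
static int null_close(int minor) { (void)minor; return 0; }
static int null_ioctl(int minor, unsigned long cmd, void *data)
{
    (void)minor; (void)cmd; (void)data;
    return 0;
}

/* Makes three calls into a driver; returns how many were serialized. */
unsigned long toy_demo(int flags)
{
    struct toy_devsw d;
    int rv;

    d.d_open = null_open;
    d.d_close = null_close;
    d.d_ioctl = null_ioctl;
    d.d_flags = flags;
    d.d_serialized = 0;
    pthread_mutex_init(&d.d_lock, NULL);

    rv = toy_dev_open(&d, 0);
    rv |= toy_dev_ioctl(&d, 0, 0UL, NULL);
    rv |= toy_dev_close(&d, 0);
    assert(rv == 0);

    pthread_mutex_destroy(&d.d_lock);
    return d.d_serialized;
}
```

With flags of 0 all three calls take the driver's own lock; with
TOY_D_MPSAFE set none of them do.  Either way no other driver's calls
are held up, which is why I object to calling such a lock "global".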

-- 
								Greg A. Woods

+1 416 218-0098;  <gwoods@acm.org>;  <g.a.woods@ieee.org>;  <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>