Subject: Re: NetBSD i386 bounce-buffer non-feature [was Re: Memory leak?]
To: None <current-users@NetBSD.ORG>
From: John F. Woods <jfw@jfwhome.funhouse.com>
List: current-users
Date: 02/09/1996 07:59:21

> I don't think that an OS should use kludges to make up for underlying
> hardware limits, at least not on cheap, Intel commodity hardware.

Making up for underlying hardware limits is ALL THAT OPERATING SYSTEMS DO.

> I take a certain interest in news postings that describe the bounce
> buffer support in FreeBSD and Linux as performance killers.

Someone who believes that bounce-buffered DMA is exactly as fast as direct
DMA will probably be disappointed.  But considering the amount of paging
MY system does when X is running, and considering how much less it would do
with another 16M of memory (or even 4M of memory), I'd bet dollars to holes
in donuts that it's faster to bcopy 4K of memory *once* than to read and write
it to disk dozens of times.
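
For anyone who wants to plug in their own numbers, here is a back-of-envelope
sketch of that comparison in C.  The function names are made up for this
example, and the copy bandwidth and per-transfer disk time are inputs you
supply, not measurements of any real machine.

```c
#include <stddef.h>

/* Hypothetical helpers for a rough estimate; every figure is caller-supplied. */

/* Time to copy `bytes` once at `copy_bw` bytes per second, in microseconds. */
static double copy_us(size_t bytes, double copy_bw)
{
    return (double)bytes / copy_bw * 1e6;
}

/* Time for `n_ios` disk transfers at `io_ms` milliseconds each, in microseconds. */
static double paging_us(unsigned n_ios, double io_ms)
{
    return n_ios * io_ms * 1e3;
}

/* How many times longer the repeated paging takes than the single copy. */
double paging_to_copy_ratio(size_t bytes, double copy_bw,
                            unsigned n_ios, double io_ms)
{
    return paging_us(n_ios, io_ms) / copy_us(bytes, copy_bw);
}
```

With placeholder inputs of a 4096-byte page, 10 MB/s of copy bandwidth, and
two dozen disk transfers at 10 ms apiece, the ratio lands in the hundreds.
The exact figure matters much less than the order of magnitude.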

As much as I hate to say anything good about Windows NT, I think it has a
reasonably clean architectural solution to the problem.  *All* NT drivers are
required to be written assuming that there are bus mapping registers between
I/O busses and main memory (just like a UNIBUS map, for old PDP-11 hacks).
You set up DMA by allocating map registers for your transfer (and get back
translated I/O-bus addresses for use by the DMA device), you do the DMA,
then you release the registers.  The routines for allocating and deallocating
these registers are told the bus type you're on (and which bus, if a system
has several of a given type), and the generic routines you call then call
bus-specific routines to do the actual work.  The ISA-specific routines,
of course, don't actually allocate map registers but instead (may) allocate
bounce buffers; the deallocate routines perform the copy "behind your back".
(Of course, for straight-through busses, these bus-specific routines do
nothing.)  The result is one uniform architectural view that is flexible
enough to cover most realistic system architectures.
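
To make the layering concrete, here is a minimal sketch in C of the same
idea.  Every name in it (dma_map_alloc, bus_dma_ops, and so on) is invented
for illustration; this is not the actual NT interface, and the "ISA" side
always bounces through one small static buffer rather than checking whether
the caller's memory is already reachable.  The "device" is just a memset
standing in for a DMA transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the real NT driver API. */
enum bus_type { BUS_ISA, BUS_PCI };
enum dma_dir  { DMA_TO_DEVICE, DMA_FROM_DEVICE };

struct dma_map {
    void        *cpu_addr;  /* the driver's own buffer */
    void        *bus_addr;  /* the address handed to the device */
    size_t       len;
    enum dma_dir dir;
};

struct bus_dma_ops {
    int  (*map_alloc)(struct dma_map *m);
    void (*map_free)(struct dma_map *m);
};

/* ISA: this static buffer stands in for memory the DMA controller can reach. */
#define BOUNCE_SIZE 4096
static unsigned char isa_bounce[BOUNCE_SIZE];
static int isa_bounce_busy;

static int isa_map_alloc(struct dma_map *m)
{
    if (m->len > BOUNCE_SIZE || isa_bounce_busy)
        return -1;
    isa_bounce_busy = 1;
    m->bus_addr = isa_bounce;
    if (m->dir == DMA_TO_DEVICE)            /* outbound data is copied up front */
        memcpy(isa_bounce, m->cpu_addr, m->len);
    return 0;
}

static void isa_map_free(struct dma_map *m)
{
    if (m->dir == DMA_FROM_DEVICE)          /* inbound data is copied back here */
        memcpy(m->cpu_addr, m->bus_addr, m->len);
    isa_bounce_busy = 0;
}

/* Straight-through busses: the device uses the caller's buffer directly. */
static int direct_map_alloc(struct dma_map *m)
{
    m->bus_addr = m->cpu_addr;
    return 0;
}

static void direct_map_free(struct dma_map *m)
{
    (void)m;                                /* nothing to undo */
}

static const struct bus_dma_ops isa_ops    = { isa_map_alloc, isa_map_free };
static const struct bus_dma_ops direct_ops = { direct_map_alloc, direct_map_free };

static const struct bus_dma_ops *ops_for(enum bus_type bus)
{
    return bus == BUS_ISA ? &isa_ops : &direct_ops;
}

/* Generic entry points: told the bus type, they dispatch to bus-specific code. */
int dma_map_alloc(enum bus_type bus, struct dma_map *m,
                  void *buf, size_t len, enum dma_dir dir)
{
    assert(m != NULL && buf != NULL);
    m->cpu_addr = buf;
    m->len = len;
    m->dir = dir;
    return ops_for(bus)->map_alloc(m);
}

void dma_map_free(enum bus_type bus, struct dma_map *m)
{
    ops_for(bus)->map_free(m);
}

/* Stand-in for a device writing `len` bytes to the bus address it was given. */
static void fake_device_write(void *bus_addr, size_t len)
{
    memset(bus_addr, 0xA5, len);
}

/* A "driver" read: identical code whichever bus it happens to sit on. */
int driver_read(enum bus_type bus, void *buf, size_t len)
{
    struct dma_map m;

    if (dma_map_alloc(bus, &m, buf, len, DMA_FROM_DEVICE) != 0)
        return -1;
    fake_device_write(m.bus_addr, m.len);
    dma_map_free(bus, &m);
    return 0;
}
```

The point of the sketch is that driver_read never mentions bounce buffers:
on the ISA path the copy happens inside the free routine, and on the direct
path both bus-specific routines are effectively no-ops.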

Of course, one might complain that having to pretend that you're allocating
and deallocating registers is a waste of CPU time for PCI or EISA busses.
On the other hand, even with NT's large installed base, I bet these routines
have not wasted as much total CPU time as this list has wasted in composing
and transporting email complaining about bounce buffering...