Subject: Re: memory mapped space
To: Robert Dobbs <banshee@gabriella.resort.com>
From: John F. Woods <jfw@jfwhome.funhouse.com>
List: port-i386
Date: 11/09/1995 13:48:42
> I'm slightly uncertain how memory mapped spaces work.
> Essentially, they map a local buffer over some memory space on the
> i/o device in question, such that reads or writes from the local buffer
> access the device's memory space.

Memory mapped spaces work however the hardware designer decided they
will work.  This would be an obvious statement if hardware designers
didn't tend to be such odd people ;-).  You may also be confusing
"memory-mapped I/O" and memory-mapped buffers; I'll treat the latter
first, then come back to the former.

Usually, a memory-mapped device (like, for example, an ethernet card
with buffers) will have some onboard RAM which the I/O hardware can
directly and efficiently access (so, for example, you set up LANCE
ring buffers in the device's local memory when the LANCE can get at it
easily); to make it easy for the CPU to use that memory, you then
arrange for an otherwise unused section of physical address space to
be mapped to that device's memory, so that memory reads and writes to
those addresses are satisfied from the board.  You could use such an
ethernet board as a very expensive way to expand the memory of a
system by a truly pathetic amount, if you didn't want the ethernet
functionality as well :-).
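To make that concrete, here's a minimal C sketch of the CPU's side of a memory-mapped buffer. The board RAM is simulated with an ordinary array, and the names (`fake_board_ram`, `map_board`, `copy_from_board`) are invented for illustration; a real driver would get its pointer by mapping the card's physical address range.

```c
#include <stdint.h>

/* Simulated board RAM; on a real system this would be a fixed range of
 * physical address space decoded by the card, which the kernel maps so
 * the CPU can reach it.  All names here are invented for illustration. */
static uint8_t fake_board_ram[2048];

/* "Map" the board's memory: hand back a pointer the CPU can use like
 * ordinary memory.  volatile keeps the compiler from caching device
 * reads or deleting device writes. */
volatile uint8_t *map_board(void)
{
    return fake_board_ram;
}

/* Copy a received packet out of the board's buffer; each access in the
 * loop would go over the bus to the board, not to main memory. */
void copy_from_board(volatile uint8_t *buf, uint8_t *dst, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = buf[i];
}
```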

The alternative strategies consist of (1) having the I/O hardware
directly access main memory (which can be tricky to design, especially
on a grossly inadequate bus like the ISA bus; some I/O devices also
have tight latency requirements that a busy system bus can have
difficulty satisfying).  This leads inevitably to the temptation to
have the I/O card understand mbufs and take a hand in creating mbuf
chains for arriving packets -- which then leads to disaster and
anguish when the mbuf layout is modified to improve performance...

And (2), having no memory mapping capability in either direction, and
instead relying on DMA; packets (for a network) would be received into
on-board memory, and when complete would be DMAd from that memory into
main memory, at the direction of the main CPU or the board's
microcontroller (depending on the design of the board and bus).  This
is good for cases where you may need a lot more on-board buffering
than is convenient for direct mapping (like a HIPPI board where if you
need buffering at all, you probably need megabytes of it), but suffers
from the two-step nature of the transfer:  first data has to land in
on-board memory, then someone has to notice that fact and then arrange
for it to land in main memory.  Usually, you'd either like to use it
straight out of the board's memory (memory mapped buffer) or have it
land in main memory to start with (alternative 1).
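The two-step path above can be sketched in C like so; this is a pure software simulation, with memcpy standing in for both the receive hardware and the DMA engine, and all names (`board_mem`, `receive_into_board`, `dma_to_main_memory`) invented for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Simulated on-board packet RAM and main-memory destination. */
static uint8_t board_mem[4096];
static uint8_t main_mem[4096];

/* Step 1: the board receives a complete packet into its own RAM
 * (faked here with memcpy) and reports the length, e.g. via an
 * interrupt plus a status register. */
int receive_into_board(const uint8_t *pkt, int len)
{
    memcpy(board_mem, pkt, len);
    return len;
}

/* Step 2: someone -- the CPU or the board's microcontroller --
 * notices the completed packet and programs a DMA transfer from
 * board RAM into main memory; memcpy stands in for the DMA engine. */
void dma_to_main_memory(int len)
{
    memcpy(main_mem, board_mem, len);
}
```

The two distinct copies are exactly the overhead the text complains about: the data has to land twice before anyone can use it.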

> So lets say I have 20 bytes in the device's buffer; if I read 10bytes
> from the buffer, will the following 10 bytes be available from the start
> of the buffer at the next read?  ie: is this a ring?

If it's a true memory mapped buffer, then not only would the next ten
bytes be available starting at the next address, but the first ten
bytes would probably still be there; you address the data like memory.
You could address every odd byte of a packet, then read every even
byte backward from the end, if it suited you.  (If you're talking
about the read() system call, then the hardware underneath is
irrelevant; reading ten bytes should deliver the current ten bytes,
and leave the next ten bytes for the next read() (unless it shouldn't
:-), like when you might want to discard the remainder of a packet if
you refuse to read it all).
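To illustrate the "address it like memory" point, here's a toy C sketch: a 20-byte buffer standing in for the mapped packet, visited in a deliberately silly order that a FIFO could never give you. The buffer and function names are invented for illustration.

```c
#include <stdint.h>

/* Simulated mapped packet buffer: 20 bytes of plain addressable
 * memory.  Reads consume nothing; the data just sits there. */
uint8_t mapped_pkt[20] = {
     0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
    10, 11, 12, 13, 14, 15, 16, 17, 18, 19
};

/* Visit every odd byte forward, then every even byte backward --
 * mapped memory doesn't mind the access pattern at all. */
void odd_then_even_backward(volatile uint8_t *buf, int len, uint8_t *out)
{
    int n = 0;
    for (int i = 1; i < len; i += 2)
        out[n++] = buf[i];
    for (int i = len - 2; i >= 0; i -= 2)
        out[n++] = buf[i];
}
```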

That's why I wondered if you really meant "memory mapped I/O", which
is a phrase used to distinguish the correct implementation from I/O
instructions :-).  You can make device registers respond to memory
addresses instead of (or in addition to) specialized I/O instructions
(and many processor families *have* no special I/O instructions).  In
that case, the behavior of (say) a FIFO chip when you read the memory
address corresponding to its output register ten times would be
identical to doing an IN instruction to it ten times: whatever data
was still in the FIFO stays there for later, unless you hit some other
register to dump it.  (Here's where the bizarre nature of hardware
design comes in: a hardware designer might make a range of addresses
decode to the same device register to simplify decoding, or might even
have the range of addresses decode to the same register but perform
other functions as well.  A FIFO might, for example, be wired up to
hand you the current byte if you read either address 0xFFFF4444 or
0xFFFF4445, but 0xFFFF4445 could also be wired to flush the FIFO after
the read -- potentially useful when you want to automatically discard
padding at the end of a packet without having to do extra work to
empty the FIFO.)
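Here's a toy software model of such a dual-decoded FIFO register in C. Offsets 0 and 1 stand in for the two addresses in the example above; the decode behavior is simulated, and all function names are invented for illustration.

```c
#include <stdint.h>

/* Toy model of a FIFO chip decoded at two adjacent addresses:
 * offset 0 hands out the next byte; offset 1 hands out the same
 * byte but flushes the FIFO as a side effect of the read. */
static uint8_t fifo[64];
static int fifo_head, fifo_tail;

void fifo_push(uint8_t b)      /* the device filling its own FIFO */
{
    fifo[fifo_tail++] = b;
}

uint8_t fifo_read(int offset)  /* the CPU reading a mapped register */
{
    uint8_t b = fifo[fifo_head];
    if (offset == 1)
        fifo_head = fifo_tail;     /* alias address: flush after read */
    else if (fifo_head < fifo_tail)
        fifo_head++;               /* normal address: consume one byte */
    return b;
}

int fifo_empty(void)
{
    return fifo_head == fifo_tail;
}
```

Reading repeatedly at offset 0 behaves like repeated IN instructions (each read consumes one byte), while a single read at offset 1 grabs the current byte and dumps whatever padding follows.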

Hope this helps.