Subject: Re: Swap beginning with cylinder 0
To: None <tech-kern@NetBSD.ORG>
From: der Mouse <mouse@Collatz.McRCIM.McGill.EDU>
List: tech-kern
Date: 05/22/1996 20:48:35
After exchanging some email with Theo (Theo! gasp! :-) on the subject
of this can of worms about reserving part of a swap area to avoid
swapping over disk labels and the like, the alternative that seems best
to me is to define a new ioctl for disk drivers.  If we call this
ioctl, for the sake of argument, DIOCSWAPSKIP, then the idea is that
the swap code does a DIOCSWAPSKIP on the swap device to find out how
much it should skip.

In my opinion, the only real choice is to push it off into the driver.
The amount to reserve has to be specific to at least the port, since,
for example, the Amiga uses a comparatively large amount of space, two
tracks or something - or so I'm told - while the sun3 port uses a fixed
8K and the i386 port uses ghod only knows how much, probably depending
on whether it's a floppy or not or something equally weird.  And while
it could just be a port-specific constant, that can end up wasting a
lot of space on ports that need a large reserved area on boot disks but
none at all when swap isn't at the beginning of the disk, or when the
disk isn't a boot disk, or some such...and it _definitely_ wastes space
when swapping on a vnd.  And then there are ccds to consider; if a
striped ccd is being swapped on, and a component includes a boot area,
you have to either skip multiple pieces of the ccd or skip
approximately N times as much, where N is the number of components.
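
To put rough numbers on the ccd case, here is a sketch of the
arithmetic.  It assumes plain round-robin striping over equal-sized
components with one fixed interleave, which is simpler than what a real
ccd may do; the function names are made up for illustration and are not
taken from the ccd driver.

```c
/*
 * Hypothetical helpers, not ccd driver code.  Assumes round-robin
 * striping: stripe unit u of component c sits at ccd stripe unit
 * u * ncomp + c, each unit being `ileave` disk blocks long.
 */

/* ccd block number that holds block `cblk` of component `comp`. */
static long
ccd_logical_block(long comp, long cblk, long ileave, long ncomp)
{
	return ((cblk / ileave) * ncomp + comp) * ileave + cblk % ileave;
}

/*
 * Size of the single leading chunk of the ccd to skip so that the
 * first `reserve` blocks of every component are left alone.  This is
 * `reserve` rounded up to whole stripe units, times ncomp.
 */
static long
ccd_contiguous_skip(long reserve, long ileave, long ncomp)
{
	long units = (reserve + ileave - 1) / ileave;

	return units * ileave * ncomp;
}
```

With a 16-block reserve, a 32-block interleave and 3 components, the
contiguous skip comes to 96 blocks, even though only one stripe unit
per component actually needs protecting.  Skipping "multiple pieces"
instead would mean using something like ccd_logical_block() to find
each protected range separately.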

Of course, if the ioctl fails with ENOTTY, it has to fall back on something.
A port-specific constant seems best to me at the moment.  I can see
only a few options:

- a port-specific constant
- a globally-chosen constant
- ask the driver for the label and skip based on that (one cylinder?)
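
A minimal sketch of how the query and the first fallback option might
fit together, written as ordinary user-level C rather than kernel code.
Everything named here is an assumption for illustration: the callback
type stands in for the proposed DIOCSWAPSKIP ioctl, and the fallback
value of 16 is just 8K expressed in 512-byte blocks, not a figure from
any port.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a driver's DIOCSWAPSKIP handler: returns 0
 * and stores the number of disk blocks to skip, or returns ENOTTY if
 * the driver does not implement the ioctl.
 */
typedef int (*swapskip_fn)(long *blocks);

/* Hypothetical port-specific fallback, in disk blocks. */
#define PORT_SWAPSKIP_DEFAULT	16L

/* Ask the driver how much to skip; fall back on the port constant. */
static long
swap_skip_blocks(swapskip_fn query)
{
	long blocks;
	int error;

	error = (query != NULL) ? query(&blocks) : ENOTTY;
	if (error == 0 && blocks >= 0)
		return blocks;
	/* ENOTTY (or, in this sketch, any other failure): use the default. */
	return PORT_SWAPSKIP_DEFAULT;
}

/* Example handler: a vnd-like device with nothing to protect. */
static int
vnd_swapskip(long *blocks)
{
	*blocks = 0;
	return 0;
}

/* Example handler: a driver that predates the ioctl. */
static int
old_swapskip(long *blocks)
{
	(void)blocks;
	return ENOTTY;
}
```

Here swap_skip_blocks(vnd_swapskip) yields 0, so nothing is wasted on a
vnd, while swap_skip_blocks(old_swapskip) yields the port default.
Whether errors other than ENOTTY should also fall back is a choice this
sketch makes for simplicity.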

What we have now is a port-specific constant, except it's not a
_relevant_ port-specific constant; ctod(CLSIZE) is the number of disk
blocks in a VM-subsystem "page", as I read it, which does not
necessarily bear any relation to the size of a disk boot area.  (It's
also not, technically, always constant - witness the SPARC port if all
the CPU types are defined - though for purposes of this discussion it
can be considered constant.)

I'd say that ideally the ioctl should return the amount of space to
skip, or - largely for the benefit of striped ccds - it should do the
rmfree() call or calls itself and return a sentinel value like -1.  But
making this special case work right with a non-SEQSWAP setup that swaps
on multiple devices could be hard enough to make it not worth doing.

No, I don't expect any of this to make it into 1.2. :-)

					der Mouse

			    mouse@collatz.mcrcim.mcgill.edu