Subject: Re: MFS over ISO-9660 union mounted with no swap space?
To: Mike Cheponis <mac@Wireless.Com>
From: Gandhi woulda smacked you <greywolf@starwolf.com>
List: tech-kern
Date: 05/13/1999 17:50:42

On Wed, 12 May 1999, Mike Cheponis wrote:

# I think if we re-frame this, we end up with a non-issue.
# 
# Firstly, the notion of "swap" is antiquated and has no purpose in an OS on
# the verge of the 21st century.
# 
# The correct way to do this, IMHO, is to use unallocated filesystem space
# as swap space.

Too much translation overhead.  The advantage of having a raw partition
is that you don't have to go through the filesystem to access the swap
space, and as an aside on the same level, you don't have to worry too much
about what you're tromping on (the paging code keeps track of what's
used and what isn't), so you won't be tromping inode blocks or whatever.

The paging daemon worries about paging; don't make it worry about
the filesystem, too (yes, I know this is kind of what happens when
one uses a swap file, which is why I don't use them).

# 
# The bit vectors that hold allocation status are in memory (as well as on
# disk).  When some disk space is needed for "swap" then only the in-memory
# allocation status vector is changed to reflect that some piece of the
# disk is "used" and can't be allocated by real files.

Yeah, but you can end up with a lot of fragmentation this way.
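
To put a rough shape on the fragmentation worry, here is a toy sketch
(Python, purely illustrative; the 64-block disk, the every-8th-block
placement and the helper names are all made up, and none of this is
NetBSD code).  Swap claims touch only the in-memory map, as proposed,
but they still chop up the contiguous free space that files could use:

```python
def largest_free_run(bitmap):
    """Return the length of the longest run of free (False) blocks."""
    best = run = 0
    for used in bitmap:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


def claim_for_swap(bitmap, block):
    """Mark one block used in the in-memory map only; nothing goes to disk."""
    if bitmap[block]:
        raise ValueError("block already in use")
    bitmap[block] = True


# Hypothetical 64-block filesystem, entirely free to start with.
in_memory = [False] * 64
before = largest_free_run(in_memory)

# Assumption: the pager scatters its claims, here every 8th block,
# as it might if it tried to stay "near" whatever files are hot.
for blk in range(4, 64, 8):
    claim_for_swap(in_memory, blk)

after = largest_free_run(in_memory)
free_blocks = in_memory.count(False)
print(before, after, free_blocks)
```

Only 8 of the 64 blocks are taken, yet the longest contiguous free
extent drops from 64 blocks to 7.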

# 
# Since this is in-memory, it cleanly survives a crash, because on disk these
# parts used as "swap" are not shown as allocated.
# 
# This method has the further advantage that (on a disk that's not nearly
# full) you can often locate what I will call this "Dynamic Swap Space" near
# the data files that are being used, and therefore, minimize total disk I/O
# time compared with a fixed swap allocation scheme.

Okay, you have a process that has just allocated a huge chunk-o-swap.
Now another process wants to allocate some of that filesystem space
as a file.

Who's going to lose, the one that got the page, or the one that wants
the file?

What if the process that wanted the file had just been told that
the space was available?

# 
# It -is- true, as in any buffer stacking situation, some run-amok process
# could request infinite DynaSwap space; in that case, pointing the kill -9
# at the next process that requests another page seems like a reasonable
# strategy.
# 
# Lastly, I think it's totally bogus on a VM system if I can't have an array
# that uses up the whole disk if I want, so the programmer has this abstraction
# of a Very Large Memory.  In braindead architectures like the i386, I'm
# probably limited to 4 GB, but in reasonable architectures, I don't see why
# there needs to be any other limit than available disk space.

Which is why we have swap files if you really need them.  Once we can
DEallocate swap, that will actually be a win, but if you're going
to do that, DynaSwap is a lose (see above), UNLESS you want to have it
available as a tunable parameter for a LIVE filesystem, in which case
you know what you're doing with your system.  If my system runs out
of filesystem space because some process snarfs it up for swap space
between the time I ran a statfs() call which returns a suitable
amount of space left and a write() sequence which fails, I'm going
to be justifiably pissed off unless I've set my system up to
do this.
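
The window I'm complaining about looks roughly like this.  It is a toy
model in Python, not real kernel or libc behaviour; the ToyFS class,
its method names and the block counts are invented for illustration:

```python
import errno


class ToyFS:
    """Hypothetical filesystem whose free blocks double as swap."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks

    def statfs(self):
        # Report the free space as of right now; no reservation is made.
        return self.free_blocks

    def claim_swap(self, nblocks):
        # The pager takes whatever free blocks it can get.
        taken = min(nblocks, self.free_blocks)
        self.free_blocks -= taken
        return taken

    def write(self, nblocks):
        if nblocks > self.free_blocks:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.free_blocks -= nblocks


fs = ToyFS(free_blocks=100)
needed = 80

looked_ok = fs.statfs() >= needed   # the check says there is room
fs.claim_swap(50)                   # another process pages out in between
try:
    fs.write(needed)
    outcome = "written"
except OSError as e:
    outcome = errno.errorcode[e.errno]

print(looked_ok, outcome)
```

The check passes and the write still fails, which is exactly the
surprise I don't want unless I asked for it.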

I tried to suggest that we claim inode blocks and convert them to
data blocks in times of dire need, but the payoff from turning i-blocks
into d-blocks is peanuts (it takes 32 inodes to make a data block
on a 4k filesystem) and reallocation of freed data blocks to inodes
proves problematic largely due to fragmentation and placement issues.
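
For the arithmetic behind "peanuts": a quick sketch, assuming 128-byte
on-disk inodes (that size is my assumption here; it is what makes
32 inodes fit a 4k block) and a made-up count of 100,000 spare inodes:

```python
INODE_SIZE = 128    # bytes per on-disk inode (assumed)
BLOCK_SIZE = 4096   # bytes per data block on a 4k filesystem

inodes_per_block = BLOCK_SIZE // INODE_SIZE


def blocks_reclaimed(spare_inodes):
    """Whole data blocks gained by giving up this many unused inodes."""
    return spare_inodes // inodes_per_block


# Hypothetical example: a filesystem with 100,000 inodes to spare.
spare = 100_000
gained_blocks = blocks_reclaimed(spare)
gained_bytes = gained_blocks * BLOCK_SIZE
print(inodes_per_block, gained_blocks, gained_bytes)
```

Giving up a hundred thousand inodes buys back a little over 12 MB.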

What you're suggesting is very close in principle to what I suggested.

# 
# -Mike Cheponis
# 
# 
# 
# On Wed, 12 May 1999, Chuck Silvers wrote:
# 
# > "Erik E. Fair" writes:
# > > As for the swap question, it really devolves down to the resource
# > > accounting issue, which you get into whether you have swap or not: how well
# > > does NetBSD deal when there is no more RAM (or RAM+swap)?
# > > 
# > > I know that the MACH VM had this notion of "lazy allocation" which allowed
# > > for large, sparse address spaces, but lost terribly if a resource crunch
# > > hit. I also know that we've been tightening up on the resource accounting
# > > with UVM, but it's not clear to me what we do for the "sorry, we're all
# > > out" case right now.
# > > 
# > > I'm all in favor of the suggestion that was made the last time this came
# > > up: a VM with two behavior modes: strict accounting, and lazy accounting,
# > > where the strict model doesn't allow anyone to allocate more RAM than you
# > > have, and the lazy model allows you to ask for whatever you want, just so
# > > long as you don't use it all. This sort of thing probably has to be a
# > > system-wide policy decision, though there was a suggestion that it could be
# > > per-process, provided that a process which requested lazy allocation would
# > > be the first to die in a crunch.
# > 
# > Mach VM would invariably hang if something allocated all the RAM+swap
# > in the system and still wanted more.  UVM in 1.4 deals with this by
# > killing anyone who wants to allocate another swap-backed page when
# > RAM+swap is full, on the theory that the process that allocates the
# > next page is the one most likely to be causing the shortage.
# > 
# > as for eager vs. lazy swap allocation, I agree that making it globally
# > selectable is a good plan.  we should probably have the system come up
# > in eager mode, with a one-way sysctl to switch to lazy mode.  this is
# > not very near the top of my todo list (or chuck c's either, I think),
# > so it'll be a long time in coming unless someone else steps up to do the
# > work.  hint, hint :-)
# > 
# > -Chuck
# > 
# 
# 


				--*greywolf;
--
NetBSD: No Sh;t!