Re: unconfiguring swap at shutdown
Tmpfs? Do this after file systems have been unmounted?
On Sep 2, 2008, at 5:57 PM, Daniel Carosone <dan%geek.com.au@localhost> wrote:
On Tue, Sep 02, 2008 at 03:52:20PM -0700, Jason Thorpe wrote:
A lot of the unkillable processes I've seen are stuck deep inside a
device driver, waiting for an event that either will never happen or
which could happen but which is difficult to arrange for.
Yes, they're waiting for an event... using some facility provided by
the kernel... condvars or tsleep... meaning the kernel could awaken the
thread and cause it to commit suicide.
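
To make the idea concrete, here is a rough userland sketch in Python (not kernel code, and not how any existing driver does it). It assumes the waiter re-checks a hypothetical must_exit flag every time it is woken; the names Waiter, force_exit and demo are invented for illustration only.

```python
import threading

class Waiter:
    """Hypothetical stand-in for a thread blocked inside a driver wait."""

    def __init__(self):
        self.cv = threading.Condition()
        self.event_arrived = False  # the event the driver is waiting for
        self.must_exit = False      # set by the "kernel" to force an exit
        self.outcome = None

    def wait_for_event(self):
        with self.cv:
            # Re-check both conditions on every wakeup, so a forced
            # wakeup is never mistaken for the real event.
            while not self.event_arrived and not self.must_exit:
                self.cv.wait()
            self.outcome = "exited" if self.must_exit else "completed"

    def force_exit(self):
        # What "awaken the thread and have it exit" could look like.
        with self.cv:
            self.must_exit = True
            self.cv.notify_all()

def demo():
    w = Waiter()
    t = threading.Thread(target=w.wait_for_event)
    t.start()
    w.force_exit()       # the awaited event itself never arrives
    t.join(timeout=5)
    return w.outcome
```

The catch, of course, is that this only works if the wait loop was written to look at such a flag and can unwind cleanly from that point.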
There are other cases of interest for a forced removal/invalidation of
swap pages too, that may favour the page-invalidation approach rather
than the process-killing approach. Those cases aren't only at shutdown.
Pages owned by something-other-than-a-process (the tmpfs example) is
one that's come up already.
Another would be a failing/failed/removed/etc swap device. Depending
on details, this currently would lead to a panic or processes blocked
forever on a failing pagein (I expect). This could lead to exactly
the kind of shutdown scenario discussed above, as well as problems in
general operation. It Might Be Nice to let the system try and proceed
instead, invalidating pages that can't be recovered, killing processes
if need be as a result.
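
A toy model of that policy, in Python, purely as a sketch: it assumes each swapped-out page records an owner and the device holding its only copy. The SwappedPage type, the "pid:" owner convention and the device names are all made up for the example and do not correspond to anything in UVM.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

@dataclass
class SwappedPage:
    owner: str                # hypothetical label, e.g. "pid:101" or "tmpfs"
    swap_dev: Optional[str]   # device holding the only copy, None if resident
    valid: bool = True

def invalidate_failed_device(
    pages: List[SwappedPage], failed_dev: str
) -> Tuple[Set[str], Set[str]]:
    """Invalidate pages whose only copy is on failed_dev.

    Returns (processes to kill, non-process owners that lost pages).
    """
    to_kill: Set[str] = set()
    other_owners: Set[str] = set()
    for page in pages:
        if page.swap_dev != failed_dev:
            continue
        page.valid = False    # contents cannot be recovered
        page.swap_dev = None
        if page.owner.startswith("pid:"):
            to_kill.add(page.owner)
        else:
            other_owners.add(page.owner)  # e.g. tmpfs: nothing to kill
    return to_kill, other_owners
```

The second return value is the awkward part: owners such as tmpfs lose data but have no process to kill, which is why the page-invalidation view covers more cases than process killing does.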
There's another case in the other direction, but it hits some of the
same kinds of error paths when paging. Ideally, when suspending a
machine with cgd(4), we should flush the keys from memory, and the
device should block new requests until the key is reloaded after
resume. On such a machine, swap is clearly one of the things likely to
be inside the cgd. We need to arrange for cgdconfig(8) and whatever
else we need to reload the key to be locked in ram before suspend,
sure, and there are ways to do that now. Having support for marking a
swap device as suspended (so the system can do something smarter than
just pile up paging requests in the disk queue) seems like it might be
useful.
Doing "hibernate" support via process swapout and a small kernel state
blob will probably raise some other cases.
Is it worth catering in detail for these cases? I'm not sure, but as
long as we're hypothesising about "smarter swapctl -d" they're worth
raising for consideration.
If it's not worth it, it is enough to have a knob that can be turned
to avoid a hang trying to detach swap when shutting down in such
circumstances. Remembering to turn that knob is another matter; maybe
some of the cases above should automatically set it if we know they're
going to lead to trouble.