Re: Hanging at shutdown with mystery "file system full" error
On Thu, 22 Oct 2015 05:34:02 +0700 Robert Elz <kre%munnari.OZ.AU@localhost> wrote:
> Date: Wed, 21 Oct 2015 23:07:53 +0200
> From: "Ian D. Leroux" <idleroux%fastmail.fm@localhost>
> Message-ID: <20151021230753.d8eab7389a3df445751e1987%fastmail.fm@localhost>
> | umount -f /dev
> | Causes an immediate hang with the same kernel error message
> | (..., on /var: file system full), attributed this time to the
> | login command.
> Not sure exactly why the full message, but in a way that makes sense:
> forcibly unmounting will invalidate any vnodes from the filesystem,
> which would include those for the mount devices for all of the
> (normal) filesystems - there'd be no way to access their vops for
> performing any i/o.
That makes sense. Thanks for the explanation.
> | The options as I currently see them are:
> | Maybe the hang has to do with unmounting a *busy* tmpfs
> | filesystem?
> You'd be able to test/provoke that by just opening a file on /tmp and
> then umount -f'ing it. I doubt there'd be a problem.
You'd be right. I just tried 'umount -f'ing both /tmp and /var/shm
while they were busy (either because a shell had its cwd inside /tmp or
because less had a file on /var/shm open) and no problems at all.
> | Would it make sense to replace "umount -aft tmpfs" with
> | "umount -at tmpfs", so as to remove only filesystems that aren't
> | busy?
> No, other busy tmpfs's (such as a command hung waiting on some device
> that isn't responding with its cwd there) wouldn't get unmounted.
> Not unmounting tmpfs at all wouldn't be so bad (their state after
> shutdown is hardly important!) but that could prevent other
> filesystems being unmounted (a mounted /var/shm would block a normal
> unmount of /var for example).
That last part I don't understand, or rather I don't understand
why /etc/rc.d/swap1 should worry about that case. I would have
thought that any subsidiary mount point, tmpfs or not, would prevent an
unmount of (say) /var. If the rc system is meant to unmount all
filesystems at shutdown time, then it already needs some logic to do so
in the right order. How are tmpfs filesystems special in this regard?
My understanding of the not-so-recent change to /etc/rc.d/swap1 is that
it is meant to ensure that there is swap space left for swapctl to
function. That seems to me to be orthogonal to the question of
ensuring that all filesystems can be unmounted.
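
To illustrate the ordering point above (a subsidiary mount such as /var/shm
has to go before its parent /var, whatever its filesystem type), here is a
minimal sketch. The function name umount_order is hypothetical and is not
taken from any rc.d script; it only relies on the fact that, in the C
locale, a parent path sorts before every path beneath it, so a reverse sort
lists children first.

```sh
#!/bin/sh
# Hypothetical helper, not part of the NetBSD rc system.
# Reads one mount point per line on stdin and prints them in an order
# in which each subsidiary mount point precedes its parent.
umount_order() {
    # In the C locale a parent path is a strict prefix of its children
    # and therefore sorts first; reversing puts the children first.
    LC_ALL=C sort -r
}
```

Fed the mount points /, /var, /var/shm, /tmp and /dev, this prints /var/shm
before /var and leaves / for last.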
> For what it's worth I saw a shutdown hang on a fairly recent current
> recently - but it was much later than you're describing, after the
> "syncing disks 3 2 1 ..." message (never got to "Done" after 1). I
> waited a while (several minutes at least) and it just sat there - till
> it needed to be powered off [that's why it was being shut down.]
> (Have not yet restarted that system to see if there were any
> uncleaned filesystems left over.)
> That system has the "normal" tmpfs's (/tmp and /var/shm) neither of
> which would have been busy - what's more, /etc/rc.shutdown had
> completed I believe, so they should have been unmounted well before.
That sounds to me like there are several possibly-unrelated shutdown
hangs, which may need to be solved independently. The one I'm seeing
seems to be a simple matter of /dev being unmounted before userland is
done, which I think we agree is not a sensible thing to be attempting.
To the extent that read-only-/ with tmpfs-/dev is a reasonable
configuration (and at least Joerg and I seem to think so),
then /etc/rc.d/swap1 needs a more precise notion of which filesystems
it can safely unmount to make space on swap.
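
One possible shape for such a "more precise notion", purely as a sketch: pick
out the tmpfs mount points from the mount listing and skip /dev. The function
name safe_tmpfs_mounts is hypothetical, the input is assumed to be lines of
the form "<source> on <mountpoint> type <fstype> (<options>)", and mount
points containing whitespace are not handled. Nothing here is taken from the
actual swap1 script.

```sh
#!/bin/sh
# Hypothetical helper, not part of /etc/rc.d/swap1.
# Reads a mount listing on stdin, assumed to look like
#   tmpfs on /tmp type tmpfs (local)
# and prints the tmpfs mount points other than /dev, one per line.
safe_tmpfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "tmpfs" && $3 != "/dev" { print $3 }'
}
```

The resulting list could then be unmounted one entry at a time instead of
using a blanket 'umount -aft tmpfs'. Whether /dev is the only mount point
that needs protecting is an assumption of this sketch.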
On systems like mine, which have way more swap than they need, it seems
practical just to omit the 'umount -aft tmpfs' from swap1_stop(), but
that doesn't address the problem that the change was initially meant to