Subject: Re: FFS reliability problems
To: Robert Elz <kre@munnari.OZ.AU>
From: Greywolf <greywolf@starwolf.com>
List: tech-kern
Date: 05/20/2002 09:31:43
On Mon, 20 May 2002, Robert Elz wrote:

#   | I've now done the tweaks to fsck_ffs; I added -z which tells it that
#   | when it finds a file with zero link count but non-zero size, it should
#   | link it into lost+found instead of torching it.
#
# Is that really going to help?   You're either going to have to use the
# option every time (in which case, why is it optional?) or it will be too
# late, the standard fsck after reboot will already have cleaned up all this
# junk.

Not necessarily.  I wouldn't choose to have it enabled all the time,
by any means.  Chances are that if it crashes in the midst of something
like that, I'm going to be the one pushing the reset button, so I'm
going to be sitting at the console watching as fsck deletes the files.

That said, however...

# If you leave it enabled all the time, I suspect you'll get inundated
# with all the junk files that you really didn't want to reappear being
# relinked after every unclean shutdown.
#
# Much better would be to simply fix the broken application that is leaving
# some kind of important data in unlinked files.  That's insane (next someone
# will be asking for a way to recover files from an MFS /tmp if the system
# crashes when something they considered important was sitting there...)

Well, that's a bit extreme, but your point is taken.  Usually, I find
that the GIMP is actually very good about things.  The unfortunate turn
of events was:

	- I hit save
	- The system wrote the inode at size 0 (link count 1)
	- The system panicked
	- I rebooted
	- ...and watched as all the (presumably temporary) files
	  associated with the project got vapourised, since the link
	  count was 0, and they were scheduled for deletion anyway

[more likely, the strategy used was to create a temp file and unlink it
 once it was open; this is, in effect, an automatic cleanup strategy,
 since once the last close() happens, the file goes away, and if one
 SIGKILLs the process, one does not have to worry about cleaning up the
 wreckage.  This can also be seen as a disadvantage, depending on how keen
 one is on sifting through the debris. ]
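
For the curious, the create-then-unlink trick described above can be
sketched roughly as follows.  This is only an illustration, assuming a
POSIX system with a local /tmp; scratch_unlinked() is a name made up for
this example, and I am not claiming this is what the GIMP actually does.
It shows the state fsck would find after a crash: an inode with a link
count of zero but a non-zero size.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file, unlink it while it is
 * still open, write some data, and report what fstat() sees.
 * Returns 0 on success, -1 on error.
 */
int
scratch_unlinked(long *nlink_out, long *size_out)
{
	char path[] = "/tmp/scratchXXXXXX";	/* assumed location */
	static const char data[] = "work in progress";
	const ssize_t len = (ssize_t)(sizeof(data) - 1);
	struct stat sb;
	int fd, rc = -1;

	fd = mkstemp(path);
	if (fd == -1)
		return -1;

	/* Remove the only name; the inode lives on while fd is open. */
	if (unlink(path) == -1) {
		close(fd);
		return -1;
	}

	/* The file still accepts data and still has a size. */
	if (write(fd, data, (size_t)len) == len && fstat(fd, &sb) == 0) {
		*nlink_out = (long)sb.st_nlink;
		*size_out = (long)sb.st_size;
		rc = 0;
	}

	/* Last close: the kernel frees the inode and its blocks. */
	close(fd);
	return rc;
}
```

If the system goes down between the unlink() and the close(), nobody
performs that final cleanup, and it is left to fsck to find the inode
with no links and decide what to do with it.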

I think the _problem_ with what I complained about was my lack of
understanding as to why files which have nonzero length are not
reconnected.  The "if the link count is zero" part is the piece I was
missing.  I have received what
I interpret as a valuable piece of education.  (We are always learning;
when you stop learning, you die.  But I digress.)

I consider the system, in this case, still to have taken the correct,
if undesirable, course of action, but I would like to have been given the
opportunity to alter that course.  Even had further attempts at recovery
been fruitless, at least I would have been given a chance.

# kre


				--*greywolf;
--
NetBSD: The Final Frontier.