Subject: Re: FFS reliability problems
To: NetBSD Kernel Technical Discussion List <tech-kern@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 06/12/2002 16:17:35
[ On Monday, June 10, 2002 at 19:51:00 (+0700), Robert Elz wrote: ]
> Subject: Re: FFS reliability problems 
>
>     Date:        Fri,  7 Jun 2002 13:19:22 -0400 (EDT)
>     From:        woods@weird.com (Greg A. Woods)
>     Message-ID:  <20020607171922.138D1AC@proven.weird.com>
> 
>   | The application is assuming the system will continue running smoothly
>   | until it does what it does with the data and closes the file itself
>   | (perhaps by exiting, cleanly or otherwise).
> 
> Rubbish.   The whole point of the create()/unlink() maneuver is to handle
> the case where things don't continue running smoothly in a semi-reasonable
> way.   If the application is to assume that it will exit properly, it
> can easily unlink its temp files on the way out, that's so boringly
> trivial to do I won't bother telling you how...

You might think that's the point, but it's clearly not what real-world
applications do -- most real-world uses of this "feature" are for the
purpose I stated, and that purpose alone.  Yes, it's a silly hack, it's
no longer necessary, and it can be misused in many broken ways, but
that's life.

>   | Applications do this in
>   | order to implement a trivial garbage collection algorithm
> 
> Yes, to handle the abnormal termination case.

No, not always just for "abnormal termination" -- often for all cleanup.

>   | -- but that
>   | doesn't mean the data they write is garbage right from the start.
> 
> No, and that's not what I said - what I said was "useless after the
> application has vanished" (or some words like that).   If the application
> is unlinking temp files that would still be useful after the application
> has died in some abnormal way, including system crashes, then the
> application is broken.

A system crash is not, for any purpose, an instance of the application
exiting.  Had the system not crashed, the application could well have
read the "temporary" data back out of the unlinked file and written it
safely to a "linked" file.  The application does not expect the system
to crash.  The user does not expect the system to crash.  Had the
system crashed a few seconds earlier or later the issue might never
have occurred, but due to bad timing the crash has caused the
application to lose data.  That is not proper behaviour for a
production-quality system.  The system must endeavour to recover any
recoverable data, regardless of whether the application could have been
engineered better so as not to have gotten into this situation in the
first place.

>   | That data is recoverable.
> 
> Not always.   Consider a temp file that has pieces of some random
> data, in binary, with the index that puts it all back together
> left in memory (if you like, think of an ed temp file, rather than
> a vi temp file, when the -x flag has been used).

Your claim is totally irrelevant and meaningless.

No systems programmer has any right to make such a declaration.  If
that's the way you feel about it then your only proper course of action
is to make it illegal for an application to successfully unlink() an
open file.  However, until and unless that change is made, only the user
and/or the application has the right to declare whether or not the data
in such a file may be discarded.

> If the data is recoverable, the application shouldn't be unlinking it,
> because ...

Applications create and unlink temporary files before they write any
data to them.  The data they write to such files may be vitally
important and unique.  They do this.  Really.  In real life.  Like it or
not.  They do it because they can, and because once upon a time they
"had to" if they didn't want to incur much additional complexity
(i.e. before atexit() was available).
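
For anyone unfamiliar with it, the idiom in question looks roughly like
this (the template name, data, and abbreviated error handling here are
invented purely for illustration):

	#include <err.h>
	#include <stdlib.h>
	#include <unistd.h>

	int
	main(void)
	{
		char tmpl[] = "/tmp/scratch.XXXXXX";	/* invented template */
		char buf[64];
		int fd;

		if ((fd = mkstemp(tmpl)) == -1)
			err(1, "mkstemp");

		/*
		 * Unlink the file immediately: the directory entry is gone,
		 * but the inode and its data stay around for as long as the
		 * descriptor is open, so the kernel cleans up automatically
		 * when the process exits -- cleanly or otherwise.
		 */
		if (unlink(tmpl) == -1)
			err(1, "unlink");

		/* The "temporary" data written here may be unique and vital. */
		if (write(fd, "irreplaceable scratch data\n", 27) != 27)
			err(1, "write");

		/* ... much later, read it back through the same descriptor ... */
		if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
			err(1, "lseek");
		if (read(fd, buf, sizeof(buf)) == -1)
			err(1, "read");

		return 0;	/* only now does the storage actually go away */
	}

Right up until the descriptor is closed or the process exits, that data
is perfectly readable by the application -- which is exactly why losing
it to a system crash plus fsck is a genuine loss.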

>   | Fsck has no business deleting it -- none whatsoever.
> 
> Of course it does.

No, it does not.  None at all.

>   Consider the other case, probably the more common
> case where it is the application that aborts for some reason. 

In that case the system is off the hook.  The application has suffered
its own catastrophic failure and has exited.  The data in the temporary
file was not lost due to a catastrophic system failure.

>  The same
> unlinked temp file was open when the app died (SEGV'd or whatever), are
> you now going to claim that the kernel has no business deleting it,
> "none whatsoever" ??? 

I've already said, very explicitly and several times, that I am not!

>  The situation is just the same, the kernel knows
> that the application aborted, as fsck can infer that the system crashed.
> The kernel knows even better than fsck does that the file in question
> was one which was open, but had been unlinked.
> 
> So, should the kernel be taking such files and linking them into
> lost+found instead of deleting them?   By your argument, that's the only
> conclusion I think, yet it would be absurd.

I'm OK with that, and I'm sure greywolf will agree -- though perhaps in
this case lost+found is the wrong place since the kernel could also
remember exactly where the file was created in the first place.

Maybe someday applications will all be converted to use atexit() to
register their cleanup routines and this silly garbage collection
technique will no longer be necessary, at which time the unlinking of an
open file can be disallowed.  Maybe that should happen immediately and
the issue should be forced.
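
For what it's worth, the atexit() approach costs almost nothing these
days.  A rough sketch (again with an invented pathname and abbreviated
error handling): the file keeps its directory entry while in use and is
removed only on an orderly exit, so after a crash it stays linked where
fsck and the user can find it.

	#include <stdlib.h>
	#include <unistd.h>

	static char tmppath[] = "/tmp/scratch.XXXXXX";	/* invented name */
	static int tmpfd = -1;

	static void
	cleanup(void)
	{
		if (tmpfd != -1) {
			close(tmpfd);
			unlink(tmppath);	/* removed only on orderly exit */
		}
	}

	int
	main(void)
	{
		if ((tmpfd = mkstemp(tmppath)) == -1)
			return 1;
		if (atexit(cleanup) != 0)
			return 1;

		/* ... write and read scratch data through tmpfd as before ... */

		return 0;	/* exit() runs cleanup(); after a crash the
				   file remains linked and recoverable */
	}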

All I know is that the system must not throw away data that can be
recovered when its own failure would be directly responsible for loss of
that recoverable data.

>   | Even the most junior sysadmin can trivially clean it up
>   | after the crash, but only if given the chance.
> 
> Huh?   Aside from the "I can rm /filesys/lost+found/*" trivial
> solution (which is no different than having fsck do it in the first
> place, except it also destroys files fsck wouldn't have removed)
> how is your junior sysadmin supposed to figure out what files are
> worth preserving, and which aren't?

By asking the user they belong to, of course.  Even the most junior
sysadmin can figure out who a file belongs to.

> Then the application that does this (when the file would make sense
> recovering) should be fixed, otherwise you certainly lose when the
> application dies (kill -9 aimed at the wrong pid by accident, or
> whatever...)

Of course -- but that's irrelevant.  The system didn't crash, and so it's
not responsible for trying to recover data lost because of the
application's own crash.

>   | Many _many_ applications create and then _immediately_ unlink
>   | temporary files that they will later use to shuffle data around.  They
>   | do so to make cleanup easy, not to say "the data I write here is trash".
> 
> Of course, but almost none of those applications realistically expect
> the data in those temp files to be of any use if the app dies or the
> system crashes - it is only useful while the app continues running.

You cannot make such a declaration because in doing so you can only be
wrong.  You did not write all those applications, and you did not run
them all, and therefore you cannot make claims about the intentions of
the application authors, nor about the intentions of the users running
those applications.

-- 
								Greg A. Woods

+1 416 218-0098;  <gwoods@acm.org>;  <g.a.woods@ieee.org>;  <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>