Subject: Re: dump for MS-DOS partitions.
To: Robert Nordier <rnordier@iafrica.com>
From: Terry Lambert <terry@lambert.org>
List: port-i386
Date: 03/25/1997 14:29:16
> > A fsck is relatively trivial.
> > 
> > That's because there is no difference between a directory entry and
> > a physical inode in the MSDOSFS... many of the checks performed by
> > the FFS fsck are simply not applicable to the idea of checking an
> > MSDOSFS.
> 
> That a fsck-like utility for FAT/VFAT is relatively trivial, feasible,
> or even desirable, is a dangerous illusion. :-)
> 
> What makes fsck itself possible is that the FFS was modified to make
> recovery (by fsck) a deterministic process.  If processing is
> interrupted, fsck needs only enough smarts to know what the FFS was
> busy with, and therefore what must be done, or undone.

Yes; it is because the FFS operation guarantees to order operations:
not that it guarantees only a single operation, but that it guarantees
a single idempotent state change will have occurred.

Of course, it is the responsibility of the MSDOSFS to make similar
guarantees.  Or any FS, LFS or EXT2FS, or otherwise, for the same
reasons.

The CHKDSK utility on DOS, which is deemed "sufficient" for recovering
DOS FS's during a failure, even in the Win95 case, where the FS data
is cached, only recovers the cases where the block references are
corrupted.

VFAT.VXD in Windows95 guarantees that operations on directory
entries won't take place until the directory entries are committed
to disk, and similarly, that operations on data for a deleted item
will be committed before the item is deleted.  The VFATFS can have
no knowledge of a file separate from the directory entry -- there is
no such thing as allowing an unlink of a file that is open, in VFAT.


> A true fsck doesn't need to `know' the filesystem *as data*.  But it
> needs a near perfect knowledge of the filesystem *as code*.  Fsck
> doesn't really look for broken data structures and repair them, it
> identifies interrupted updates and completes them (rolling them back or
> forward).

More accurately, it needs a directed graph of the transaction events
which can occur as idempotent transactions so it can determine what
node-to-node graph state traversal operation was in process at the
time of the crash.


> A fsck needs to be paired with a particular FS implementation,
> because it is (logically) an integral part of a *specific* FS
> implementation.

Yes.  In many cases, it would be correct to implement the fsck in the
mount, if:

o	The state were deterministic (don't allow unordered transaction
	state changes -- don't allow -async mount in the FFS case).

o	There were no possibility of external FS state change
	not occurring as the result of an FS transaction, either through
	the operation of another OS against your FS image, or, more
	commonly, as the result of various kinds of media failure.

The reason the FFS fsck remains separate is largely because of the
possibility of media failure, where recovering the FS to a known state
becomes more important than recovering it to the correct state, since
the correct state is physically unrecoverable (media failures are seldom
ordered, idempotent operations 8-)).


> With the DOS FS(es), the situation is too different.
> 
> Even if the dozen or so DOS (or DOS FS) implementations all did
> metadata updates ordered the same way, these good intentions
> would still potentially be perverted by caching software/subsystems
> that don't provide (or are not configured for) `write through'
> operation.

Yes.  This dictates what you can and can't cache.  This argues against
taking, at random, anyone's fsck implementation, unless the implementation
knows the minimal idempotence of operations for a given structure, and
assumes the operations will be correctly ordered.

This still leaves things like "delete pending" operations which are applied
to the in core vnode, but which have not been propagated to the underlying
fs, because of the FS being stateless (NFS file rename hack) or because
of the FS being unable to separate the concepts of file container (inode)
from file reference (directory entry)... like VFATFS.


> In addition, the DOS FS lacks a `clean' flag, so FS repair is not
> forced after a crash.  By the time FS repair *is* attempted, there
> may have been multiple interrupted updates, undetected, each of which
> left FS inconsistencies, which then interacted to produce further
> inconsistencies....

The answer is to force the repair each time.

You can implement an FS-based tag object (say the archive bit on a hidden
file) to get "clean flag" behaviour, but this is not common across all
implementations, so it can't really be trusted unless you are guaranteed
that the last one to use the drive was you or a protected mode MS OS
that didn't allow raw volume access without acquiring a volume lock.
This is a reasonable guarantee to make in most cases, assuming a drive
shared only between BSD and Windows 95 or Windows NT.  Otherwise, the
check *should* be done each time to validate the shutdown state.


> Another problem is that a bug in any application can unintentionally
> modify the DOS filesystem code itself, or corrupt system tables.  So
> however perfect the DOS FS implementation may be, its correct operation
> can't be assumed.

Not in a protected mode OS, unless it's a system level program that
acquires a level 3 volume lock then a level 0 volume lock.  On the
non-protected mode MS OS's, well, you are on your own.


> Any kind of deterministic fsck for the DOS FS is therefore a pipe dream
> (except if only the BSD DOSFS implementation is ever allowed to update
> the filesystem ... not a realistic restriction, given why anyone is
> likely to be using a DOS FS in the first place).

Or an MS OS, which makes ordering and protection guarantees.  Like
Windows 95 and NT do.


> A DOS FS repair utility has to be heuristic.  But to represent such a
> utility as fsck-like, makes false claims.  A heuristic utility
> functions completely differently; and a heuristic utility hasn't a
> remotely comparable chance of success.

Agreed.  It isn't a guarantee of *the* correct state, only one
of *a* consistent state.  A consistent state is enough to keep the
protected mode FS driver from faulting, though.


> Fsck also provides a very bad model for what a heuristic file repair
> utility should be like.  When something has to be done, fsck knows
> what it is doing: so it needs a minimum of interaction with the user.

That is a *good* model.  Every place you allow the user a decision
allows the user to make the wrong one.  Throw out the user, and the
computer will be much happier overall.


>    o A utility of a goal-seeking AI-type (not unlike a chess program)
>      which can run a million `what if' scenarios before deciding,
>      in the case of a cross-linked cluster, for example, which link
>      to preserve.

The goal is "a consistent state", not "the consistent state".  As a
result, some simple heuristics, like "break circular cluster chains",
"duplicate shared cluster chains to produce files with identical chain
component content", etc., are enough to guarantee *a* consistent state.
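
As a rough illustration of the "break circular cluster chains"
heuristic, here is a minimal Python sketch against a toy in-memory FAT.
The list layout, the EOC and FREE marker values, and the function name
are assumptions made for the sketch; a real FAT uses reserved high
values for end-of-chain, and a real repair would also have to fix the
file size in the directory entry.

```python
EOC = -1   # toy end-of-chain marker (assumption; real FATs use reserved high values)
FREE = 0   # toy free-cluster marker (assumption)


def break_circular_chain(fat, start):
    """Walk the chain beginning at cluster `start`.

    `fat[c]` holds the cluster that follows `c`, or EOC.  If the chain
    loops back onto a cluster already visited, terminate it at the last
    cluster before the loop closes.  Returns True if a repair was made.
    Assumes every link is in range and never points at a free cluster.
    """
    seen = set()
    cur = start
    while cur != EOC:
        seen.add(cur)
        nxt = fat[cur]
        if nxt in seen:
            # The next link would revisit a cluster: cut the chain here.
            fat[cur] = EOC
            return True
        cur = nxt
    return False


# Chain 2 -> 3 -> 4 -> 2 is circular; the repair cuts the 4 -> 2 link.
fat = [0, 0, 3, 4, 2]
repaired = break_circular_chain(fat, 2)
```

The result is *a* consistent chain, not necessarily the one the file
had before the crash, which is all the heuristic promises.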

> Where one or more directories link to the same cluster, it may be
> impossible to resolve the situation sensibly.

Duplicate the chain if you have space, pick one by date if you don't.
You can allow the user to specify policy for the application of the
heuristic, but not allow them interaction with it (at least in the
default case).
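
A minimal Python sketch of that rule, again on a toy in-memory FAT.
Everything named here (the EOC and FREE markers, the dict fields
'start' and 'mtime', the `prefer` policy knob) is hypothetical, and
"pick one by date" is read as "keep the most recently modified file
unless policy says otherwise", which is an assumption.  Directory entry
sizes are not updated, and chains are assumed to be free of cycles.

```python
EOC = -1   # toy end-of-chain marker (assumption)
FREE = 0   # toy free-cluster marker (assumption)


def walk(fat, start):
    """Return the list of clusters in the chain beginning at `start`."""
    out, cur = [], start
    while cur != EOC:
        out.append(cur)
        cur = fat[cur]
    return out


def resolve_cross_link(fat, data, a, b, prefer="newest"):
    """Resolve a cross-link between files `a` and `b` without asking the user.

    `a` and `b` are dicts with 'start' (first cluster) and 'mtime'.
    Returns 'clean', 'duplicated', or 'truncated'.
    """
    in_a = set(walk(fat, a["start"]))
    chain_b = walk(fat, b["start"])
    shared = [c for c in chain_b if c in in_a]   # always a suffix of chain_b
    if not shared:
        return "clean"

    # Clusters 0 and 1 are reserved in FAT, so data clusters start at 2.
    free = [c for c in range(2, len(fat)) if fat[c] == FREE]
    if len(free) >= len(shared):
        # Enough space: give b its own copy of the shared tail.
        copies = free[:len(shared)]
        for i, (old, new) in enumerate(zip(shared, copies)):
            data[new] = data[old]
            fat[new] = copies[i + 1] if i + 1 < len(copies) else EOC
        prefix = chain_b[:chain_b.index(shared[0])]
        if prefix:
            fat[prefix[-1]] = copies[0]
        else:
            b["start"] = copies[0]
        return "duplicated"

    # No space: the policy picks a winner by date; the loser is truncated
    # just before the first shared cluster.
    keep_a = (a["mtime"] >= b["mtime"]) == (prefer == "newest")
    loser = b if keep_a else a
    chain = walk(fat, loser["start"])
    idx = chain.index(shared[0])
    if idx == 0:
        loser["start"] = EOC          # nothing of its own left: empty file
    else:
        fat[chain[idx - 1]] = EOC
    return "truncated"
```

The policy argument is set once, up front; nothing in the repair path
stops to ask a question.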


> Asking the user only puts him in a maze of twisty little decision
> paths, all different; an arbitrary decision risks destroying
> nearly 100% of the filesystem; and an exhaustive, recursive
> analysis of the consequences is likely to take longer than the user
> (and/or the universe) is prepared to wait.

So don't ask.  Do.


> > In the second case, it asks "convert cluster chains to files?", and
> > makes files to contain the chains.  This, also, can never happen
> > during normal operation.
> 
> If directories are involved, this can also totally scramble the
> filesystem.

If it is a directory, it's obvious.  If it isn't, then it's also
obvious.  You have to ask "what conditions could cause this error"
and not allow the conditions to occur in that sequence in your VFATFS
(just as DOS does not allow the conditions to occur -- generally,
reuse of a cache block written out of time order is the only possible
case where this could occur).
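
One way to read "obvious" for the directory case: a FAT subdirectory
begins with 32-byte "." and ".." entries carrying the directory
attribute bit, so the first cluster of a recovered chain can be tested
for that signature.  The Python sketch below is only an illustration
under that reading; the function name is hypothetical, and a real
checker would validate much more (start clusters, later entries, and
so on).

```python
ATTR_DIRECTORY = 0x10   # directory bit in the attribute byte (offset 11)
ENTRY_SIZE = 32         # size of one FAT directory entry in bytes


def looks_like_directory(first_cluster):
    """Guess whether a chain's first cluster holds a subdirectory.

    Checks only that the first two entries are named "." and ".."
    (space padded to 11 bytes) and are flagged as directories.
    """
    if len(first_cluster) < 2 * ENTRY_SIZE:
        return False
    dot = first_cluster[0:ENTRY_SIZE]
    dotdot = first_cluster[ENTRY_SIZE:2 * ENTRY_SIZE]
    return bool(
        dot[0:11] == b"." + b" " * 10
        and dot[11] & ATTR_DIRECTORY
        and dotdot[0:11] == b".." + b" " * 9
        and dotdot[11] & ATTR_DIRECTORY
    )
```

Anything that fails the test gets treated as plain file data when the
chain is converted to a file.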


> What I think the DOS FS needs is a sort of `lint'.  I've been working
> on something that even offers optional advice like ``Warning: cross-
> linked directories exist: don't even think of running scandisk''. :-)
> Being lint-like, it only finds problems, it doesn't fix them.

Alternately, if you had the hidden file you were using for the "clean"
bit, you could store a "these files are inaccessible until the user fixes
them" list.



					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.