Subject: Re: ffs compat problem since ffs2
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Luke Mewburn <lukem@NetBSD.org>
List: tech-kern
Date: 09/26/2003 09:45:36
On Fri, Sep 26, 2003 at 01:06:25AM +0200, Manuel Bouyer wrote:
  | On Fri, Sep 26, 2003 at 08:35:30AM +1000, Luke Mewburn wrote:
  | > Recently I experienced some lossage with a new kernel on an older
  | > -current system (looked like FS corruption), and upon further
  | 
  | Was it current before or after ufs2 integration ?

Before.  (Kernel was dated March 19; AFAICT, UFS2 went in early April).


  | What kind of fs corruption ? At kernel or fsck level ?

First boot; kernel couldn't find /sbin/init or /dev/console
(errno = 20, ENOTDIR)

Second boot; kernel accessed /, but accessing /tmp (an mfs
created with a March 19 mount_mfs) caused a panic.


  | >    FS_FLAGS_UPDATED
  | > 	sys/ufs/ffs/ffs_vfsops.c
  | > 		set in fs_old_flags
  | > 		tested in fs_old_flags
  | > 		fs_flags gets all fs_old_flags except FS_FLAGS_UPDATED
  | > 	sys/ufs/ffs/fs.h
  | > 		defined
  | > 	sbin/fsck_ffs/utilities.c
  | > 		tested in fs_old_flags
  | 
  | This shouldn't matter; it'll always get the old values, but current
  | make sure the old and new fields will match.
  | 
  | > 	usr.sbin/dumpfs/dumpfs.c
  | > 		tested in fs_old_flags
  | 
  | Same comment as for fsck

Well, dumpfs returns the incorrect values for my file systems, saying
that they're 4.2/4.3BSD because "printold" is true due to the missing
FS_FLAGS_UPDATED...

Note that even if we tweak the tests in fsck_ffs & dumpfs to ignore
FS_FLAGS_UPDATED and use another method of testing that, it's possible
that dumpfs will still return incorrect results for a file system
created on an old system that has not yet had its superblock updated
by a new kernel...


  | > Given that fsck_ffs & dumpfs currently test FS_FLAGS_UPDATED,
  | 
  | They do but shouldn't.
  | At the current state, it should't matter if FS_FLAGS_UPDATED is set or not,
  | current will make sure the old and new field matches.
  | 
  | I'm not sure why you ran into this problem; but I don't think the lack of
  | FS_FLAGS_UPDATED on disk could cause it. It would be good to be able to
  | investigate further.

Look at how FS_FLAGS_UPDATED is used by these tools.  It should be
apparent why dumpfs gets confused.


  | >     b)	Implement a new macro, something like:
  | > 
  | > 		#define FS_UPDATED_UFS1(fs) \
  | > 			(((fs)->fs_magic == FS_UFS1_MAGIC)
  | > 			&& ((fs)->fs_maxbsize == (fs)->fs_bsize))
  | > 
  | > 	Use that instead of "(foo.fs_old_flags & FS_FLAGS_UPDATED) == 0"
  | > 
  | > 	Deprecate FS_FLAGS_UPDATED except as a place-holder (maybe
  | > 	rename it in fs.h ?)
  | 
  | We need to do that anyway.
  | FS_FLAGS_UPDATED will still be set, in case new flags are used, so we can't
  | deprecate it.

FS_FLAGS_UPDATED will NOT be set in fs_old_flags that's written back
to the superblock, and that's what fsck_ffs and dumpfs are checking.

FS_FLAGS_UPDATED is also not copied from fs_old_flags to fs_flags,
so we can't check fs_flags in those tools for the bit being set either.

So, basically, FS_FLAGS_UPDATED is a flag that is "useless" to
userland at this point.  Useless in that there's no point setting it
in makefs / newfs, an useless in testing it since the kernel doesn't
let userland "see" it in fs_old_flags (because of the FS_SWAPPED bit
clash) or in fs_flags (for unknown reasons).



Luke.


<side rant>

BTW: I suggest that next time you check where these type of flags are 
used throughout the tree (and not just under src/sys) before making
these type of kernel changes...  If this comes across too "harsh",
I apologise, but I'm annoyed at this apparent lack of due diligence
(by checking the entire source tree for how this flag was used).

I've been screwed multiple times because of apparent lack of care
taken in this ffs2 migration, and I still maintain that it should
have been done as a separate file system,  both in the original
implementation in FreeBSD, and the import into NetBSD.  When people
lose data in ffs1 file systems because of ffs2 import lossage, that is
unacceptable.  When that person is me, it's even worse :/ :)

We are "NetBSD" and one of the features I always admired about our
project was our committent to not losing (file system) data,
_unlike_ my early experiences with Linux (and from what I can tell,
current "leading edge" filesystems in Linux as well).  It is
unfortunate that the ffs2 implementation has meant that I can't
honestly believe that about NetBSD any more... :-(