tech-kern archive


File system corruption due to UFS2 extended attributes



Importing FreeBSD's extended attribute code into NetBSD's implementation
of UFS2 has introduced a compatibility problem with previous releases of
NetBSD.  The explanation of this problem is a bit involved and requires
knowing some history, so please bear with me as I explain.

On 2002-06-20, initial support for UFS2 was added to FreeBSD in svn r98542
(now git 1c85e6a35d93195e896b030d9a55f7ac4ccee2c3).  In this version of
the code, the on-disk field for extended attributes (di_extb[] in
struct dinode) was present but unused.  This field in the dinode was
initialized to zero and never accessed after that.

On 2002-07-19, the FreeBSD UFS2 kernel code was changed in svn r100344
(now git 7aca6291e3fb803b6563ce6e39680da3a2ba0feb) to actually use the
di_extb[] field to store pointers to blocks containing extended attributes.

On 2002-09-23, support was added in FreeBSD fsck_ffs in svn r103885
(now git c18ef4c018881d7e39b27df5bf7b65b4d58cac6b) to consider the blocks
referenced by di_extb[] as allocated, and thus to refrain from freeing
any blocks which were in use for extended attribute data.

All of the above changes were made during the FreeBSD 5 development cycle,
so FreeBSD 4.x did not have UFS2 at all and FreeBSD 5.0 had full support
in both the kernel and fsck_ffs for UFS2 with extended attributes.

Over in NetBSD-land, on 2003-04-02 a version of UFS2 was imported which
was based on the original 2002-06-20 version from FreeBSD, and did not
include the later changes for extended attributes, with a note in the
commit message that the extended attribute support would be added "later".
This version of UFS2 was first released in NetBSD 2, and it did not
include any support for extended attributes.

Fast-forward to 2020, when the UFS2 code in NetBSD was finally enhanced
to support extended attributes.  This support will be first released
in NetBSD 10.

Unlike in FreeBSD where the first release of UFS2 included all of the
support for extended attributes, in NetBSD there have been many releases
which support UFS2 but without any knowledge of UFS2 extended attributes.
This is where we have a problem.

If a UFS2 file system is mounted under NetBSD 10 and an extended attribute
(such as an ACL) is stored on a file, and that file system is then moved
to a NetBSD 9 system, fsck_ffs on the NetBSD 9 system will report that the
file system is corrupted: the extended attribute block is marked as
allocated, but it is not referenced in any way that NetBSD 9 knows about.
fsck_ffs will repair this corruption by marking that extended attribute
block as free, but it will not clear the file's di_extb[] field, because
that field is not used in NetBSD 9.  NetBSD 9 now thinks that the file
system is ready to use, and the file system can be mounted and used
normally.  NetBSD 10, on the other hand, would consider this file system
fine before the NetBSD 9 fsck_ffs was run.  After the NetBSD 9 fsck_ffs
has been run, the NetBSD 10 fsck_ffs would say that the file system is
corrupted, because the blocks that the di_extb[] field points to are
marked as free.

Because the extended attribute block is now free, it can be allocated
to a different inode for a different purpose, eg. to a regular file as
a data block.  If this file system is then transported back to a
NetBSD 10 system, the file system will be mountable without running fsck
because the NetBSD 9 kernel will have marked the file system clean during
unmount, and the NetBSD 10 kernel will then try to process the new contents
of the doubly-referenced ("dup") block as both application data and
extended attribute metadata.

The NetBSD 10 fsck_ffs will report that the file system is corrupted again
because the extended attribute block which was freed and then allocated again
is now referenced by multiple inodes (di_extb[] from the original inode
and eg. di_db[] from a different inode in this example).  NetBSD 10 fsck_ffs
will then repair this dup-blocks corruption by zeroing both inodes
that refer to this block.
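
To make the sequence above concrete, here is a toy model of it in C.
This is only a sketch and not the real fsck_ffs logic: the struct layout,
the sizes, and every toy_* name are made up for illustration, each inode
has a single data pointer and a single extattr pointer, and the
"knows_extb" flag stands in for the difference between a NetBSD 10-like
and a NetBSD 9-like fsck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NBLOCKS 16  /* hypothetical toy file system size */
#define TOY_NINODES 4

/* Toy inode: block number 0 means "no block". */
struct toy_inode {
    uint32_t db;    /* stands in for di_db[]   */
    uint32_t extb;  /* stands in for di_extb[] */
};

struct toy_fs {
    bool allocated[TOY_NBLOCKS];           /* block allocation bitmap */
    struct toy_inode inodes[TOY_NINODES];
};

/*
 * Toy fsck pass.  Counts the references to each block, frees any block
 * that is allocated but unreferenced, and returns the number of problems
 * seen (unreferenced allocated blocks, referenced-but-free blocks, and
 * blocks referenced more than once).  Only the first kind is "repaired"
 * here; the other two are just reported.
 */
int toy_fsck(struct toy_fs *fs, bool knows_extb)
{
    int refs[TOY_NBLOCKS] = {0};
    int problems = 0;

    for (int i = 0; i < TOY_NINODES; i++) {
        const struct toy_inode *ip = &fs->inodes[i];

        assert(ip->db < TOY_NBLOCKS && ip->extb < TOY_NBLOCKS);
        if (ip->db != 0)
            refs[ip->db]++;
        /* An old fsck never looks at the extattr pointer. */
        if (knows_extb && ip->extb != 0)
            refs[ip->extb]++;
    }
    for (int b = 1; b < TOY_NBLOCKS; b++) {
        if (fs->allocated[b] && refs[b] == 0) {
            fs->allocated[b] = false;  /* "repair": free the block */
            problems++;
        } else if (!fs->allocated[b] && refs[b] > 0) {
            problems++;                /* referenced but marked free */
        } else if (refs[b] > 1) {
            problems++;                /* dup block */
        }
    }
    return problems;
}
```

Running toy_fsck() with knows_extb set to false on a file system where
one inode has an extb pointer frees that block while leaving the pointer
in place.  If the block is then handed to another inode as a data block,
the old-style pass sees nothing wrong, while the new-style pass sees a
dup block.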


So what can we do about this?  There aren't any really great options.
But the only change which will guarantee that old NetBSD releases
(which do not know about extended attributes) will not corrupt file system
images where extended attributes have been stored is to create a new variant of
UFS2 with a different magic number (the "fs_magic" field in the superblock).
This is what I propose to do.  I spoke with Kirk McKusick about this problem
and he agreed that creating a new UFS2 variant with a different magic number
is the best way to deal with this situation.

This new UFS2 variant (which I'm calling "UFS2ea") will be different
from the existing NetBSD UFS2 as shipped in previous NetBSD releases
only in that it will add support for extended attributes, and that
it will only be supported by NetBSD 10 and later.  In all other respects,
UFS2ea will be the same as NetBSD's existing UFS2.
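
The reason a separate magic number helps can be sketched roughly as
follows.  This is an illustration only, not code from the patch:
FS_UFS2_MAGIC is the existing UFS2 magic, but TOY_UFS2EA_MAGIC is a
made-up placeholder (not the real UFS2ea value), and the toy_* names
are hypothetical.  The point is that code which has never heard of
UFS2ea fails to recognize the file system at all, rather than treating
it as plain UFS2 and "repairing" it.

```c
#include <stdbool.h>
#include <stdint.h>

#define FS_UFS2_MAGIC    0x19540119u /* magic of the existing UFS2 format */
#define TOY_UFS2EA_MAGIC 0x0ea2ea02u /* HYPOTHETICAL placeholder value */

enum toy_format { TOY_FMT_UNKNOWN, TOY_FMT_UFS2, TOY_FMT_UFS2EA };

/*
 * Classify a superblock by its fs_magic field.  knows_ufs2ea is false
 * for code that predates UFS2ea (NetBSD 9 and earlier in this sketch).
 */
enum toy_format toy_identify(uint32_t fs_magic, bool knows_ufs2ea)
{
    if (fs_magic == FS_UFS2_MAGIC)
        return TOY_FMT_UFS2;
    if (knows_ufs2ea && fs_magic == TOY_UFS2EA_MAGIC)
        return TOY_FMT_UFS2EA;
    /* Old code lands here for UFS2ea and refuses to touch the image. */
    return TOY_FMT_UNKNOWN;
}

/* Extended attributes are only usable on the UFS2ea variant. */
bool toy_extattr_allowed(enum toy_format fmt)
{
    return fmt == TOY_FMT_UFS2EA;
}
```

With this arrangement, toy_identify(TOY_UFS2EA_MAGIC, false) yields
TOY_FMT_UNKNOWN, which models an old fsck_ffs or kernel declining to
operate on a UFS2ea image.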

The user-interface changes for UFS2ea will include:

 - newfs will accept a new option "-O2ea" to specify creating a UFS2
   file system with extended attribute support.  "-O2" will continue to
   create UFS2 images without support for extended attributes, which will be
   compatible with NetBSD 9.

 - fsck will take a new option "-c ea" to specify that an existing UFS2
   file system should be converted to support extended attributes
   (ie. converted to UFS2ea).  This conversion first clears all of the on-disk
   pointers to extended attribute blocks (the inode "di_extb" field),
   since in NetBSD releases prior to NetBSD 10, those pointers could only
   have been set to non-zero values by corruption in the file system.
   Because clearing the on-disk pointers to extended attributes could be
   effectively removing ACLs that had been set while we were mistakenly
   allowing extended attributes in non-ea UFS2 file systems, any files in which
   these pointers were not already zero will have the permission bits in the
   mode field cleared as well, so that enabling extended attributes does not
   accidentally result in anyone gaining access to the file that they did not
   have before.

 - dumpfs will report a UFS2ea file system as:
	format  FFSv2ea
   rather than
	format  FFSv2

 - makefs will take new options "-e 1" and "-o extattr=1" to specify that
   it should create a UFS2 file system of the UFS2ea variant, with support
   for extended attributes.

 - mounting a non-ea UFS2 file system will now behave the same as UFS2
   did in previous NetBSD releases... all extended attribute operations
   will fail with EOPNOTSUPP.
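
The per-inode part of the "-c ea" conversion described in the list above
could look roughly like the following.  This is a sketch under
assumptions, not the code in the patch: struct toy_dinode is a cut-down
stand-in for the real UFS2 dinode, the toy_* names are hypothetical, and
I am assuming here that "permission bits" means the low 0777 bits of the
mode (whether setuid/setgid/sticky are also cleared is not covered by
this sketch).

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NXADDR    2      /* number of extattr block pointers per inode */
#define TOY_PERM_MASK 0777u  /* ASSUMPTION: bits treated as permission bits */

/* Cut-down stand-in for the on-disk UFS2 inode. */
struct toy_dinode {
    uint16_t mode;               /* file type and permission bits */
    int64_t  extb[TOY_NXADDR];   /* stands in for di_extb[]       */
};

/*
 * Convert one inode for UFS2ea.  Any non-zero extattr pointer is
 * treated as stale and cleared.  If a pointer was set, the permission
 * bits are cleared as well, so that losing an ACL cannot widen access.
 * Returns true if the inode was modified.
 */
bool toy_convert_inode(struct toy_dinode *dp)
{
    bool had_ext = false;

    for (int i = 0; i < TOY_NXADDR; i++) {
        if (dp->extb[i] != 0) {
            dp->extb[i] = 0;
            had_ext = true;
        }
    }
    if (had_ext)
        dp->mode &= (uint16_t)~TOY_PERM_MASK;
    return had_ext;
}
```

An inode whose pointers were already zero is left untouched, mode
included, so files that never had extended attributes keep their
existing permissions across the conversion.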


I have put a patch implementing all of the above at:

https://ftp.netbsd.org/pub/NetBSD/misc/chs/diff.ufs2ea.1


One remaining issue is that for the last two years, NetBSD-current has
allowed creation of extended attributes in existing UFS2 file systems,
and the above patch will implicitly delete all extended attributes in
those existing UFS2 file systems.  To help anyone who might have
created UFS2 ACLs or other extended attributes during this time that
NetBSD-current allowed creating extended attributes in file systems
with the original UFS2 magic number, I wrote a small program that
just changes the magic number in the primary superblock and all
alternates, but leaves everything else alone.  The source for that
utility is here:

http://ftp.netbsd.org/pub/NetBSD/misc/chs/ufs2ea-flip.c

I don't want to make this one-off tool part of any NetBSD release,
since the only reason to use this tool is if you have been using
extended attributes in -current prior to when extended attributes
will be restricted to UFS2ea file systems.  Anyone who has only
ever run releases and not -current will not need this tool,
and it's really a bad idea to use this tool if you don't need to.


Comments about all this?

-Chuck

