Subject: Re: FFS byteswapping
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 10/24/2001 23:31:33
>> Am I hallucinating, or is there still a bug here?  ISTM that
>> ufs_sb_swap should be caching pre-swap copies of cpc and nrpos same
>> as it does postbloff and postblfmt.

Actually, when I went to fix this, I noticed that it can be fixed
perhaps more simply by just moving the computation of n16, o16, and len
up to before the byteswapping starts.

Here's what I'm using for ffs_sb_swap now.  (Rather than rework the
rest of the code that calls it, I left the third argument in.  For use
with -current, of course, it would need the needswap variable and
fs_magic checking code.)

void
ffs_sb_swap(o, n, ns)
	struct fs *o, *n;
	int ns;
{
	int i, len;
	u_int32_t *o32, *n32;
	u_int16_t *o16, *n16;
	u_int32_t postbloff, postblfmt;

	postbloff = ufs_rw32(o->fs_postbloff, ns);
	postblfmt = ufs_rw32(o->fs_postblformat, ns);
	/* Compute these before swapping, in case o==n. */
	o16 = (postblfmt == FS_42POSTBLFMT) ? o->fs_opostbl[0] :
	    (int16_t *)((u_int8_t *)o + postbloff);
	n16 = (postblfmt == FS_42POSTBLFMT) ? n->fs_opostbl[0] :
	    (int16_t *)((u_int8_t *)n + postbloff);
	len = (postblfmt == FS_42POSTBLFMT) ?
	    128 /* fs_opostbl[16][8] */ :
	    ufs_rw32(o->fs_cpc, ns) * ufs_rw32(o->fs_nrpos, ns);
	/* in order to avoid a lot of lines, as the first 52 fields of
	 * the superblock are u_int32_t, we loop here to convert it.
	 */
	o32 = (u_int32_t *)o;
	n32 = (u_int32_t *)n;
	for (i=0; i< 52; i++)
		n32[i] = bswap32(o32[i]);
   
	n->fs_cpc = bswap32(o->fs_cpc);
	n->fs_fscktime = bswap32(o->fs_fscktime);
	n->fs_contigsumsize = bswap32(o->fs_contigsumsize);
	n->fs_maxsymlinklen = bswap32(o->fs_maxsymlinklen);
	n->fs_inodefmt = bswap32(o->fs_inodefmt);
	n->fs_maxfilesize = bswap64(o->fs_maxfilesize);
	n->fs_qbmask = bswap64(o->fs_qbmask);
	n->fs_qfmask = bswap64(o->fs_qfmask);
	n->fs_state = bswap32(o->fs_state);
	n->fs_postblformat = bswap32(o->fs_postblformat);
	n->fs_nrpos = bswap32(o->fs_nrpos);
	n->fs_postbloff = bswap32(o->fs_postbloff);
	n->fs_rotbloff = bswap32(o->fs_rotbloff);
	n->fs_magic = bswap32(o->fs_magic);
	/* byteswap the postbl */
	for (i = 0; i < len; i++)
		n16[i] = bswap16(o16[i]);
}

> [...] I didn't specifically notice a problem in the various fsck_ffs
> endian tests I did [...]

Yes, I wondered why this hadn't bitten anyone.  It can't be just a
matter of postblformat normally be FS_42POSTBLFMT; the filesystem that
cored fsck_ffs on me wouldn't've done that then, and I took no special
action when setting it up.  And cpc and nrpos are normally relatively
small integers, so getting them swapped means the multipliction will
overflow and len will come out zero, meaning the postbl wouldn't get
swapped.  I would expect this to be a problem, but as I don't really
have my head around a lot of the FFS data structures, I can only shrug
and say "I guess not".

> If you updated to a -current-ish fsck_ffs & ffs_bswap.c, would you be
> able to test fsck_ffs on the filesystem that caused the coredump, and
> then test the following patch and see if that improves things?

An fsck_ffs built with untouched sources except for the ffs_sb_swap I
quoted above breezes right past the place where it used to coredump.
It finds lots of stuff wrong with the filesystem, but since fsck finds
the same problems when I copy it to a machine where it's native-endian
and the filesystem came from a seriously sick OS install, I think the
filesystem really is damaged and fsck_ffs is fine.

> Index: ffs_bswap.c
> ===================================================================
> RCS file: /cvsroot/syssrc/sys/ufs/ffs/ffs_bswap.c,v
> retrieving revision 1.13
> diff -p -u -r1.13 ffs_bswap.c
> --- ffs_bswap.c	2001/09/06 02:16:02	1.13
> +++ ffs_bswap.c	2001/10/25 02:42:58
[...]

This looks like an intermediate stage my ffs_sb_swap passed through,
before I noticed that simply moving the computation of o16/n16/len
would solve it without needing extra variables.  I would expect it to
work just fine.  (You're also welcome to swipe the one above, either
conceptually or, mutatis mutandis, textually, if you think it's a
better tack to take.)

This does seem to me to raise an issue, though: if fs_postbloff, but
nothing else, is garbaged, fsck_ffs can coredump.  ISTM fsck_ffs should
_never_ coredump due to filesystem damage.  (Even if you agree
philosophically, it's not clear this is actually worth worrying about.)

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B