Current-Users archive


Re: one remaining mystery about the FreeBSD domU failure on NetBSD XEN3_DOM0



At Fri, 16 Apr 2021 11:44:08 +0100, David Brownlee <abs%netbsd.org@localhost> wrote:
Subject: Re: one remaining mystery about the FreeBSD domU failure on NetBSD XEN3_DOM0
>
> On Fri, 16 Apr 2021 at 08:41, Greg A. Woods <woods%planix.ca@localhost> wrote:
>
> > What else is different?  What am I missing?  What could be different in
> > NetBSD current that could cause a FreeBSD domU to (mis)behave this way?
> > Could the fault still be in the FreeBSD drivers -- I don't see how as
> > the same root problem caused corruption in both HVM and PVH domUs.
>
> Random data collection thoughts:
>
> - Can you reproduce it on tiny partitions (to speed up testing)
> - If you newfs, shutdown the DOMU, then copy off the data from the
> DOM0 does it pass FreeBSD fsck on a native boot
> - Alternatively if you newfs an image on a native FreeBSD box and copy
> to the DOM0 does the DOMU fsck fail
> - Potentially based on results above - does it still happen with a
> reboot between the newfs and fsck
> - Can you ktrace whichever of newfs or fsck to see exactly what its
> writing (tiny *tiny* filesystem for the win here :)

So, the root filesystem is marked clean (from the factory, and verified
as OK by at least NetBSD's fsck), but when the check is forced with '-f'
it is found to be corrupt.

Unfortunately I don't have any real FreeBSD machines available (though I
could possibly get it installed on my MacBookPro again, but that's
probably a multi-day effort at this point).

However I've just found a way to reproduce the problem reliably and with
a working comparison with a matching-sized memory disk.

First off, attach a tiny 4MB LVM LV to the FreeBSD domU -- that's
apparently the smallest LV possible:

dom0 # lvm lvs
  LV          VG      Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  build       scratch -wi-a- 250.00g
  fbsd-test.0 scratch -wi-a-  30.00g
  fbsd-test.1 scratch -wi-a-  30.00g
  nbtest.pkg  vg0     -wi-a-  30.00g
  nbtest.root vg0     -wi-a-  30.00g
  nbtest.swap vg0     -wi-a-   8.00g
  nbtest.var  vg0     -wi-a-  10.00g
  tinytest    vg0     -wi-a-   4.00m
dom0 # xl block-attach fbsd-test format=raw, vdev=sdc, access=rw, target=/dev/mapper/vg0-tinytest


Now a run of the test on the FreeBSD domU (first showing the kernel
seeing the device attachment):


# xbd3: 4MB <Virtual Block Device> at device/vbd/2080 on xenbusb_front0
xbd3: attaching as da2
xbd3: features: flush
xbd3: synchronize cache commands enabled.
GEOM: new disk da2

# dd if=/dev/zero of=tinytest.fs count=8192
8192+0 records in
8192+0 records out
4194304 bytes transferred in 0.081106 secs (51713998 bytes/sec)
# mdconfig -a -t vnode -f tinytest.fs
md0
# newfs -o space -n md0
/dev/md0: 4.0MB (8192 sectors) block size 32768, fragment size 4096
        using 4 cylinder groups of 1.03MB, 33 blks, 256 inodes.
super-block backups (for fsck_ffs -b #) at:
 192, 2304, 4416, 6528
# newfs -o space -n da2
/dev/da2: 4.0MB (8192 sectors) block size 32768, fragment size 4096
        using 4 cylinder groups of 1.03MB, 33 blks, 256 inodes.
super-block backups (for fsck_ffs -b #) at:
 192, 2304, 4416, 6528
# dumpfs da2 >da2.dumpfs
# dumpfs md0 >md0.dumpfs
# diff md0.dumpfs da2.dumpfs
1,2c1,2
< magic 19540119 (UFS2) time    Fri Apr 16 18:48:55 2021
< superblock location   65536   id      [ 6079dc17 1006b3b4 ]
---
> magic 19540119 (UFS2) time    Fri Apr 16 18:49:57 2021
> superblock location   65536   id      [ 6079dc55 348e5947 ]
27c27
< magic 90255   tell    20000   time    Fri Apr 16 18:48:55 2021
---
> magic 90255   tell    20000   time    Fri Apr 16 18:49:57 2021
40c40
< magic 90255   tell    128000  time    Fri Apr 16 18:48:55 2021
---
> magic 90255   tell    128000  time    Fri Apr 16 18:49:57 2021
53c53
< magic 90255   tell    230000  time    Fri Apr 16 18:48:55 2021
---
> magic 90255   tell    230000  time    Fri Apr 16 18:49:57 2021
66c66
< magic 90255   tell    338000  time    Fri Apr 16 18:48:55 2021
---
> magic 90255   tell    338000  time    Fri Apr 16 18:49:57 2021
# fsck md0
** /dev/md0
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1 files, 1 used, 870 free (14 frags, 107 blocks, 1.6% fragmentation)

***** FILE SYSTEM IS CLEAN *****
# fsck da2
** /dev/da2
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
ROOT INODE UNALLOCATED
ALLOCATE? [yn] n


***** FILE SYSTEM MARKED DIRTY *****
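
Since dumpfs shows only timestamp and id differences, a raw sector-by-sector comparison of two images (for example the md0 backing file and a copy of the LV taken on the dom0) might show where the on-disk contents really diverge. The following is only a sketch of that idea in Python: the function name and the 512-byte sector size are my assumptions, and the timestamp and id differences shown by dumpfs will naturally show up as differing sectors too.

```python
SECTOR = 512  # assumed sector size, consistent with the 8192 sectors newfs reports


def differing_sectors(path_a, path_b, sector=SECTOR):
    """Return (first, last) sector ranges in which the two images differ."""
    ranges = []
    start = None  # first sector of the current run of differences, if any
    index = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(sector)
            chunk_b = b.read(sector)
            if not chunk_a and not chunk_b:
                break  # both images exhausted
            if chunk_a != chunk_b:
                if start is None:
                    start = index
            elif start is not None:
                ranges.append((start, index - 1))
                start = None
            index += 1
    if start is not None:
        ranges.append((start, index - 1))
    return ranges
```

Calling differing_sectors("tinytest.fs", "da2.img") would then list the sector ranges to inspect by hand; "da2.img" is a hypothetical name for a copy of the LV.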


So I ktraced the fsck_ufs run, and though I haven't looked at it with a
fine-tooth comb and the source open, the only thing that seems a wee bit
different about what fsck does is that it opens the device twice with
O_RDONLY, then shortly before it prints the first "** /dev/da2" line it
opens it a third time with O_RDWR, closes the second one, and calls
dup() on the third one so that the duplicate has the same FD# as the
second open had.
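
For anyone wanting to replay that descriptor shuffle outside of fsck, here is a rough Python sketch of the sequence as I read it from the trace. It is not taken from the fsck source; the function name and the 512-byte read size are made up for illustration, and it relies on dup() returning the lowest free descriptor number.

```python
import fcntl
import os


def replay_open_sequence(path, nbytes=512):
    """Two O_RDONLY opens, one O_RDWR open, close the second, dup the third."""
    fd1 = os.open(path, os.O_RDONLY)
    fd2 = os.open(path, os.O_RDONLY)
    fd3 = os.open(path, os.O_RDWR)
    try:
        os.close(fd2)
        # dup() hands back the lowest free descriptor, normally the slot fd2 just vacated.
        fd_dup = os.dup(fd3)
        try:
            mode = fcntl.fcntl(fd_dup, fcntl.F_GETFL) & os.O_ACCMODE
            data_ro = os.pread(fd1, nbytes, 0)     # read via the read-only open
            data_rw = os.pread(fd_dup, nbytes, 0)  # same read via the dup()ed O_RDWR open
        finally:
            os.close(fd_dup)
    finally:
        os.close(fd3)
        os.close(fd1)
    return {
        "reused_slot": fd_dup == fd2,
        "read_write": mode == os.O_RDWR,
        "same_data": data_ro == data_rw,
    }
```

Run against the md-backed file and against the Xen-backed device, a difference in "same_data" would point at the descriptor handling rather than at newfs.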

Otherwise it does a few reads of different sizes (all multiples of 512
bytes, none larger than 64KB), sometimes read()+lseek() and sometimes
pread(), and some from each descriptor.

Maybe that's the big difference -- it uses pread(2).
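
A quick way to test the pread(2) hunch would be to read the same ranges both ways and compare. This is just a sketch under my own assumptions: the function name is hypothetical, the caller supplies the device size (4194304 bytes for the test volume), and the 64KB chunk size is chosen only because it matches the largest read seen in the trace.

```python
import os


def first_mismatch(path, size, chunk=65536):
    """Return the first offset where pread() and lseek()+read() disagree, else None."""
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in range(0, size, chunk):
            length = min(chunk, size - offset)
            positional = os.pread(fd, length, offset)  # does not move the file offset
            os.lseek(fd, offset, os.SEEK_SET)
            sequential = os.read(fd, length)           # uses and advances the file offset
            if positional != sequential:
                return offset
    finally:
        os.close(fd)
    return None
```

If first_mismatch("/dev/da2", 4194304) returned an offset while the same call on /dev/md0 returned None, that would narrow things down to the positional read path.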

It also appears to never explicitly close the third open, the one that
was dup()ed to replace the second open, so I think the likes of valgrind
would call that a leaked FD.  :-)


If I use "newfs -O1" then the symptoms change a bit. Most importantly,
the filesystem can be checked from the NetBSD dom0, and it checks
cleanly until FreeBSD's fsck is run on the domU and marks it dirty.
After that NetBSD immediately sees the dirty flag, but no other damage.


So I'm still not sure how this could be related to simply updating the
dom0 NetBSD kernel (and Xen, but I've now gone through 4.11 and 4.13 and
they both (mis)behave the same way) -- and this was a change which did
not visibly affect any NetBSD domUs as they are happily serving and
building alongside these tests.

All of the above on its own would smell more like a FreeBSD bug
somewhere in their Xen blkfront driver, but it would have to be pretty
deep since the initial corruption I encountered was in a full HVM domU.
All of this worked A-OK before; most recently, I believe, with an
8.99.32 kernel and Xen 4.8 (and definitely with 7.99 and 4.5 before that).


--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>



