tech-kern archive


Re: one remaining mystery about the FreeBSD domU failure on NetBSD XEN3_DOM0



At Sun, 11 Apr 2021 23:04:29 -0000 (UTC), mlelstv%serpens.de@localhost (Michael van Elst) wrote:
Subject: Re: one remaining mystery about the FreeBSD domU failure on NetBSD XEN3_DOM0
>
> woods%planix.ca@localhost ("Greg A. Woods") writes:
>
> >SALVAGE? [yn] ^Cada0: disk error cmd=write 8145-8152 status: fffffffe
>
> That seems to be a message from the disk driver:

Yes, exactly.  That message is from the FreeBSD kernel: fsck was trying
to update the superblock and mark the filesystem as dirty (their
fsck_ffs always opens the device for writing, even with '-n'), and the
write failed, of course, because the backend has attached the disk as a
read-only device.


> The latter case should log a message on Dom0 about DIOCCACHESYNC
> failing.

I haven't seen anything like that yet.


> But if you have sectors of DEV_BSIZE like here there is no difference
> and no conflict.

Yes, as far as I've seen, the FreeBSD domU reports a sector size of 512
bytes for every xbd(4) device and for every GEOM partition it creates or
finds on those devices.

FreeBSD newfs seems to concur that sectors are 512 bytes even when
writing to a raw (i.e. un-labeled) /dev/da1 (which has a 30GB LVM LV
backing it):


# newfs /dev/da1
/dev/da1: 30720.0MB (62914560 sectors) block size 32768, fragment size 4096
        using 50 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.

$ echo 62914560 \* 512 / 1024 / 1024 | bc -l
30720.00000000000000000000


The NetBSD dom0 reported the attachment of this device with a matching
number of (512-byte) sectors:

	xbd backend: attach device scratch-fbsd--t (size 62914560) for domain 2
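
The same cross-check can be scripted with plain sh arithmetic.  This is
only a minimal sketch, assuming 512-byte sectors as reported above; the
function name sectors_to_mib is made up for illustration and is not an
existing tool:

```shell
#!/bin/sh
# Convert a sector count to MiB, assuming 512-byte sectors (DEV_BSIZE).
# sectors_to_mib is a hypothetical helper name used only for this sketch.
sectors_to_mib() {
	echo $(( $1 * 512 / 1024 / 1024 ))
}

# Sector count reported both by FreeBSD newfs and by the NetBSD dom0
# xbd backend attach message.
sectors_to_mib 62914560
```

If both sides agree on the sector count and the sector size, the result
should match the 30720 MB that newfs printed.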



> The FreeBSD-12.2-RELEASE-amd64-mini-memstick.img I just fetched
> has two MBR partitions:
>
> Partition table:
> 0: EFI system partition (sysid 239)
>     start 1, size 1600 (1 MB, Cyls 0/0/2-0/50/1)
> 1: FreeBSD or 386BSD or old NetBSD (sysid 165)
>     start 1601, size 789520 (386 MB, Cyls 0/50/2-386/18/17), Active
>
> Making our disklabel program read the FreeBSD disklabel was a bit
> tricky, there is a bug that makes it segfault, but:
>
> type: unknown
> disk:
> label:
> flags:
> bytes/sector: 512
> sectors/track: 1
> tracks/cylinder: 1
> sectors/cylinder: 1
> cylinders: 789520
> total sectors: 789520
> rpm: 3600
> interleave: 0
> trackskew: 0
> cylinderskew: 0
> headswitch: 0           # microseconds
> track-to-track seek: 0  # microseconds
> drivedata: 0
>
> 8 partitions:
> #        size    offset     fstype [fsize bsize cpg/sgs]
>  a:    789504        16     4.2BSD      0     0     0  # (Cyl.     16 - 789519)
>  c:    789520         0     unused      0     0        # (Cyl.      0 - 789519)
>
>
> Apparently the MBR partition 1 starting at sector 1601 is a disk
> image itself and the disklabel is in sector 1 of that image.

Well, I think in FreeBSD parlance it is just an MBR partition that
contains a BSD label, and that BSD label further divides the MBR
partition into more disk partitions.  That's just the FreeBSD way: if I
understand correctly, their BSD labels are restricted to the confines of
the MBR partition where they sit.

And yes, FreeBSD's disklabel output matches:

# disklabel da0s2
# /dev/da0s2:
8 partitions:
#          size     offset    fstype   [fsize bsize bps/cpg]
  a:     789504         16    4.2BSD        0     0     0
  c:     789520          0    unused        0     0     # "raw" part, don't edit


So in FreeBSD the filesystem there is at "/dev/da0s2a" -- where "da0" is
the "device", "s2" is the second MBR partition, and "a" is of course the
BSD label's "a" partition.  They use more or less the same naming for
GPT entries as well.


> Adding a wedge to access the partition at offset 16 (+1601) gives:
>
> # dkctl vnd0 addwedge freebsd 1617 789504 ffs
> dk6 created successfully.

I had not thought to try that yet.  It's good to see it works!
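
For the record, the wedge offset is just the MBR partition start plus
the label-relative offset of the "a" partition.  A minimal sketch of
that arithmetic follows, using the numbers from the fdisk and disklabel
output quoted above; the variable names are only illustrative, and the
script merely prints the dkctl command rather than running it:

```shell
#!/bin/sh
# Values taken from the quoted fdisk and disklabel output.
mbr_start=1601      # start sector of MBR partition 1 (sysid 165)
mbr_size=789520     # size of that MBR partition, in sectors
label_offset=16     # offset of 'a' relative to the start of the label
label_size=789504   # size of 'a', in sectors

# Absolute start of the 'a' partition on the whole image.
wedge_start=$(( mbr_start + label_offset ))

# Sanity check: 'a' must end within the enclosing MBR partition.
if [ $(( label_offset + label_size )) -gt "$mbr_size" ]; then
	echo "partition 'a' extends past its MBR partition" >&2
	exit 1
fi

# Print (not execute) the corresponding wedge command.
echo "dkctl vnd0 addwedge freebsd $wedge_start $label_size ffs"
```

With these numbers the "a" partition ends exactly at the end of the MBR
partition (16 + 789504 = 789520), which is consistent with the label
being confined to it.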

Now that I can get vnd0d to export the .img file to FreeBSD, I think
I've effectively eliminated worries about vnd(4) causing the bigger
problems.



Speaking of which, I think this might be evidence that the FreeBSD
system was suffering the effects of accessing the corrupted filesystem I
was experimenting with.  Note the SIGSEGVs from processes, apparently
after the kernel has gone into its halt-spin loop (this is the first
time I've seen this particular misbehaviour):


# halt -pq
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 done
Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-1' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-2' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-3' to stop... done
All buffers synced.
Uptime: 5h22m39s

The operating system has halted.
Please press any key to reboot.

pid 412 (syslogd), jid 0, uid 0: exited on signal 11
pid 343 (devd), jid 0, uid 0: exited on signal 11


--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>

Attachment: pgpEjtCSIkmzu.pgp
Description: OpenPGP Digital Signature


