tech-kern archive


Re: vnode_has_large_blocks() (vnd.c rev 1.255)



On Thu, Sep 06, 2018 at 10:09:35AM +0200, Manuel Bouyer wrote:

> but that's not going to change. If you move a virtual machine from a 512b
> to a 4k sector disk, you expect the virtual machine to still run.
> If you change the virtual disk's sector size, its filesystems will
> probably be unusable.

Actually, I wouldn't expect this if the backing store is really a device.
Using a file as a backing store is obviously different: the filesystem
abstraction is supposed to handle this, but it may cost performance.

But that could be solved by making xbd pass through the geometry. If
a vnd is used as backing store, the geometry can still be simulated.

> Hum, in this case, sc_geom contains what was set at VNDIOCSET time isn't it ?

Yes, that's how the virtual geometry is set.


> Unless we provided a geometry at vnconfig time, it'll always have 512b
> sectors. This is not read from the vnd's disklabel.

Yes. Obviously the vnd's disklabel must be consistent with the vnd
geometry.


> Sure, but that doesn't seem to be a problem. The backing filesystem
> is 64k/8k, yet I can use filesystems with smaller fragments in the domUs,
> without problems. It looks like VOP_BMAP/VOP_STRATEGY deals with it
> (actually I think it writes only the relevant physical sectors, even if
> that's not a full fragment, because that's how nbp is set up).

The filesystem itself only does I/O in terms of fragment-sized blocks (or
multiples). I'm not sure how well VOP_BMAP/VOP_STRATEGY work if you issue
smaller I/O requests, in particular when you access the same offset with
different block sizes. The buffer cache only matches the offset and
assumes that a "block" in memory always has a specific size.
So while reading or writing through vnd might work with such a partial
block (as long as it is still at least as large as a physical sector), it
will at least corrupt I/O that is done on the file itself.
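
To illustrate the concern, here is a toy sketch of a cache that matches
on the offset alone. It is not the NetBSD buffer cache; every name in it
(toy_cache, toy_lookup, toy_insert, toy_size_mismatch) is hypothetical.
It only shows how a lookup keyed by offset can return an entry whose
size differs from the size of the current request.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 8	/* hypothetical cache capacity */

struct toy_block {
	int      valid;
	uint64_t offset;	/* lookup key: byte offset only */
	size_t   size;		/* size the entry was created with */
};

struct toy_cache {
	struct toy_block slot[TOY_SLOTS];
};

/* Find an entry by offset; the request size is not part of the key. */
static struct toy_block *
toy_lookup(struct toy_cache *c, uint64_t offset)
{
	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (c->slot[i].valid && c->slot[i].offset == offset)
			return &c->slot[i];
	return NULL;
}

/* Record a block of the given size at the given offset. */
static struct toy_block *
toy_insert(struct toy_cache *c, uint64_t offset, size_t size)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (!c->slot[i].valid) {
			c->slot[i].valid = 1;
			c->slot[i].offset = offset;
			c->slot[i].size = size;
			return &c->slot[i];
		}
	}
	return NULL;	/* cache full */
}

/*
 * Return 1 if a request of `size` bytes at `offset` would hit an
 * existing entry that was created with a different size.
 */
static int
toy_size_mismatch(struct toy_cache *c, uint64_t offset, size_t size)
{
	struct toy_block *b = toy_lookup(c, offset);

	return b != NULL && b->size != size;
}
```

In this toy model, inserting an 8192-byte block at offset 0 and then
asking for 512 bytes at offset 0 hits the 8192-byte entry, which is the
kind of size disagreement described above.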

In any case, if the filesystem uses 512-byte frags, the fast path should
be used, and apparently it isn't.
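
As a rough sketch of the kind of test I would expect, assuming the
fast path is usable whenever each vnd sector is a whole number of
backing-store fragments. This is not the code in vnd.c; the function
name and both parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical eligibility check (an assumption, not vnd.c):
 * the fast path is allowed when the vnd sector size is a non-zero
 * multiple of the backing filesystem's fragment size, so that every
 * vnd sector maps onto whole fragments.
 */
static bool
toy_fast_path_ok(uint32_t vnd_secsize, uint32_t backing_fragsize)
{
	if (vnd_secsize == 0 || backing_fragsize == 0)
		return false;
	return vnd_secsize % backing_fragsize == 0;
}
```

Under this assumption, 512-byte sectors on a filesystem with 512-byte
frags qualify, while 512-byte sectors on 8k frags do not.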


Greetings,
-- 
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."

