On Fri, Jan 22, 2010 at 05:46:31AM +0000, David Holland wrote:
> On Thu, Jan 21, 2010 at 10:30:20PM +0000, Michael van Elst wrote:
> > IMHO there need to be three different ways to specify block
> > offsets and block counts:
> > 1. in units of blocks of the physical device
> > 2. in units of blocks of DEV_BSIZE bytes
> > 3. in bytes
> Don't forget: 4. in units of the filesystem block size...
I omitted this from the list because only the filesystem
itself has the notion of 'filesystem block size'; when
talking to the device it goes back to using DEV_BSIZE. It
becomes clear that 'filesystem block size' is a very private
measure of a filesystem when you think about FFS fragments
where the filesystem already uses a second size and about
aggregated IO where multiple blocks are accessed as one
> > and we need to establish what units are used where.
> IM (fairly strong) O everything should be kept in byte counts, and
> never block counts because if you have more than one unit in use it is
> far too easy to accidentally mix them or provide the wrong one, and
> because they're all the same language-level type there's little hope
> of detecting such problems automatically.
I would like a system where all I/O is measured in bytes, but this
requires a complete redesign for all disk devices and all filesystems.
And you won't get rid of the physical blocks; at some point you
have to translate.
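To illustrate that final translation step, here is a minimal sketch of turning a byte range into physical sectors. The names phys_range and bytes_to_phys are hypothetical and not taken from any existing driver; the sketch simply rejects unaligned ranges rather than doing a read-modify-write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: a request expressed in physical sectors. */
struct phys_range {
	uint64_t first;	/* first physical sector */
	uint64_t count;	/* number of physical sectors */
};

/*
 * Convert a byte range into physical sectors.  Returns false if the
 * range is not aligned to the sector size, because a partial-sector
 * transfer would need a read-modify-write that this sketch omits.
 */
static bool
bytes_to_phys(uint64_t off, uint64_t len, uint32_t secsize,
    struct phys_range *out)
{
	assert(secsize != 0);
	if (off % secsize != 0 || len % secsize != 0)
		return false;
	out->first = off / secsize;
	out->count = len / secsize;
	return true;
}
```

For example, with 2048-byte sectors a 4096-byte transfer at byte offset 8192 covers two sectors starting at sector 4, while a 512-byte transfer is rejected as unaligned.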
> Furthermore, Murphy's Law dictates that in any particular place the
> count you are given is frequently not in the units you need to give
> something else, and then you end up converting back and forth all over
> everywhere. This serves no purpose and tends to obfuscate the code
This is how it works now. We do translate blocks back and forth
all over the place, except that there are a lot of assumptions that
physical block size is the same as DEV_BSIZE.
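A minimal sketch of why that assumption goes unnoticed, assuming DEV_BSIZE is 512 bytes and the physical sector size is a multiple of it. SKETCH_DEV_BSIZE and devblk_to_phys are hypothetical names for illustration, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: DEV_BSIZE is 512 bytes. */
#define SKETCH_DEV_BSIZE 512u

/*
 * Convert a block number counted in DEV_BSIZE units into a physical
 * sector number.  secsize is the physical sector size in bytes and is
 * assumed to be a nonzero multiple of SKETCH_DEV_BSIZE.  A block
 * number that is not sector-aligned is rounded down.
 */
static uint64_t
devblk_to_phys(uint64_t blkno, uint32_t secsize)
{
	assert(secsize != 0 && secsize % SKETCH_DEV_BSIZE == 0);
	return blkno / (secsize / SKETCH_DEV_BSIZE);
}
```

With 512-byte sectors the conversion is the identity, so code that skips it appears to work. With 2048-byte sectors, DEV_BSIZE block 8 is physical sector 2, and skipping the conversion addresses the wrong sector.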
Also, filesystems organize data in larger chunks. There is always
some translation going on between block or extent numbers and
DEV_BSIZE offsets now, or byte offsets in your ideal system.
On the filesystem side it won't get simpler.
> > The necessary changes are rather small. In particular, dkwedge_info needs
> > to be extended to keep track of the physical sector size so that the dk
> > driver can do the transformations.
> The physical sector size should be available to callers (just not part
> of the API/ABI) so this ought to be done regardless.
I haven't thought about compatibility issues yet. Where is dkwedge_info
exposed to userland?
Michael van Elst
"A potential Snark may lurk in every tree."