Subject: Re: The demise of DEV_BSIZE
To: Chris Torek <torek@BSDI.COM>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 10/06/1999 10:13:41
On Wed, 6 Oct 1999, Chris Torek wrote:

> >Both character and block devices have gained a new function call, d_bsize:
> >void    (*d_bsize)      __P((dev_t dev, int * bshift, int * bsize));
> >which fills in the bshift and bsize values for a device.
> 
> You do not need both (and as cgd noted, "unconfigured device" seems
> like it should be an error-return).  The reason you only need one
> is that all you need is the number of blocks.  The caller can
> translate that into a shift if appropriate, e.g.:
> 
> 	ispow2 = (size & (size - 1)) == 0;
> 	shift = ispow2 ? ffs(size) - 1 : 0;

Ewww. ffs(). One of the things I'm adding is an intlog2() routine which
returns the bit position of the most significant set bit, i.e. the floor
of log base 2. It uses a binary search, so for a u_int32_t it takes 5
comparisons (well, 6 for either 0 or 1).
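
For illustration, here is a minimal sketch of a binary-search log2 along
those lines. It is not the actual intlog2() code: the name
intlog2_sketch, the -1 return for an input of 0, and the exact comparison
layout are all assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: position of the most significant set bit
 * (floor of log2). Returns -1 for 0, which is an assumed convention.
 */
static int
intlog2_sketch(uint32_t v)
{
	int r = 0;

	if (v == 0)
		return -1;
	/* Halve the search window each step: 16, 8, 4, 2, 1 bits. */
	if (v & 0xffff0000u) { v >>= 16; r += 16; }
	if (v & 0x0000ff00u) { v >>= 8;  r += 8;  }
	if (v & 0x000000f0u) { v >>= 4;  r += 4;  }
	if (v & 0x0000000cu) { v >>= 2;  r += 2;  }
	if (v & 0x00000002u) { r += 1; }
	return r;
}
```

Unlike ffs(), which finds the least significant set bit, this gives a
usable answer for sizes that are not powers of 2 as well.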

My other concern is that I'd rather we not be recalculating the shift
values over and over. Right now the shift gets calculated when a device
driver notes the block size for a device. I'd like to keep caching that
value everywhere we need it. :-)

I kinda like Chuck's idea of using negative values for non-power-of-2
block sizes. Then we're down to one value, and we always have the shift
when we need it.
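
One way the single-value scheme could look, as a sketch only. The
encoding below (non-negative means a shift, negative means minus the
block size in bytes) is an assumed reading of the idea, and the names
bsize_encode and bytes_to_blocks are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical encoding: a power-of-2 block size is stored as its
 * shift (>= 0); any other size is stored as -size.
 * Assumes 0 < size <= INT32_MAX.
 */
static int
bsize_encode(uint32_t size)
{
	int shift = 0;

	if ((size & (size - 1)) == 0) {
		/* Power of 2: count up to the single set bit. */
		while ((1u << shift) < size)
			shift++;
		return shift;
	}
	return -(int)size;
}

/* Convert a byte count to blocks, shifting when the encoding allows. */
static uint64_t
bytes_to_blocks(uint64_t bytes, int enc)
{
	if (enc >= 0)
		return bytes >> enc;
	return bytes / (uint64_t)(-(int64_t)enc);
}
```

With this, bsize_encode(512) yields 9 and a 2352-byte block yields
-2352, so the caller tests the sign once and never recomputes the shift.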

> Why not have these report whatever size they like?  A memory-disk
> could even have a programmable block size, if only for testing the code.

The driver certainly can. I'm not trying to say it MUST use DEF_BSIZE. :-)
But I think there will be times when drivers want outside input, which
DEF_BSIZE can give. Well, until now they were using DEV_BSIZE, which had
to go. :-)

> >For instance, UFS keeps track of "blocks" allocated to a file in
> >units of DEV_BSIZE. I've changed this to UFS_BSIZE & UFS_BSHIFT.
> >ufs quotas are in the same unit.
> 
> This seems right.  POSIX probably requires 512 (if it says anything
> at all about this).

The only thing I've seen is that the stat call uses 512-byte blocks, which
results in sick code. ufs translates a lot of byte sizes into UFS_BSIZE
units, then the ufs_getattr call (which had called the above routines)
translates those UFS_BSIZE block counts back into bytes on disk, which the
stat syscalls turn into S_BLKSIZE units (which are 512 bytes). Ick. At
least it's all being done with btodb & dbtob so it's fast...
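
The round trip described above can be sketched as follows. This is not
the kernel code: the SK_ names are hypothetical stand-ins, the value 9
for the UFS shift is an assumption, and the helpers truncate the way a
plain shift does.

```c
#include <stdint.h>

#define SK_UFS_BSHIFT	9			/* assumed value */
#define SK_UFS_BSIZE	(1 << SK_UFS_BSHIFT)
#define SK_S_BLKSIZE	512			/* stat block unit */

/* Step 1: ufs accounts for space in UFS_BSIZE units. */
static uint64_t
sk_bytes_to_ufs_blocks(uint64_t bytes)
{
	return bytes >> SK_UFS_BSHIFT;
}

/* Step 2: getattr turns the block count back into bytes on disk. */
static uint64_t
sk_ufs_blocks_to_bytes(uint64_t blocks)
{
	return blocks << SK_UFS_BSHIFT;
}

/* Step 3: stat reports those bytes in S_BLKSIZE units. */
static uint64_t
sk_bytes_to_stat_blocks(uint64_t bytes)
{
	return bytes / SK_S_BLKSIZE;
}
```

Three conversions to report one number, but each is a shift or a divide
by a constant.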

Take care,

Bill