Subject: Re: Supporting sector size != DEV_BSIZE
To: None <tech-kern@netbsd.org>
From: Darrin B. Jewell <jewell@mit.edu>
List: tech-kern
Date: 06/25/2002 03:30:42
Bill Studenmund <wrstuden@netbsd.org> writes:
| > On Mon, 24 Jun 2002, Darrin B. Jewell wrote:
| >
| >   1. The compiled in value for DEV_BSIZE should always be 512
| >   2. existing media precedent should be followed to decide where to
| >      change current uses of the DEV_BSIZE constant.

| I disagree with point 1. We should be free to change DEV_BSIZE as we wish
| (as long as it's 2**n).  That's the only way to tell if we're truly
| independent of it, or if we have lurking dependencies. The one caveat is
| that it won't work to set DEV_BSIZE to be bigger than a disk's media size.
| i.e. we can't (easily)  use 512-byte-sector devices on a 1k-DEV_BSIZE
| kernel.

I'm going to attempt to hold my ground on this point.

 1. A kernel that supports media of only one sector size at a time
    is only marginally useful.  Since 512-byte-sector media is so
    common, a system with a DEV_BSIZE of anything larger is
    crippled.

 2. How are you going to find all the places where someone used
    DEV_BSIZE when they really just meant 512?  How about the places
    where they used 512 or some other indirect computation where they
    really meant DEV_BSIZE?  The meaning of DEV_BSIZE has gotten so
    obscured and diluted that it is no longer useful for anything
    other than as an indicator of the number 512.  If we don't change
    the value of DEV_BSIZE, we don't have to worry about finding
    any of these cases.
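
To illustrate the kind of lurking dependency I mean, here is a minimal
sketch.  Both function names are hypothetical and are not taken from
the tree, and dev_bsize is passed as a parameter only to stand in for
the compile-time constant.  The two functions agree only while the
value is 512.

```c
#include <assert.h>
#include <stdint.h>

/* Block count computed through the symbolic constant. */
static uint64_t
bytes_to_blocks_symbolic(uint64_t nbytes, uint32_t dev_bsize)
{
	assert(dev_bsize != 0);
	return nbytes / dev_bsize;
}

/* The same computation with 512 baked in as a shift by 9. */
static uint64_t
bytes_to_blocks_literal(uint64_t nbytes)
{
	return nbytes >> 9;
}
```

Nothing ties the second function to DEV_BSIZE by name, so a search
for the constant will never find it.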

| > | >   . units based on the disklabel d_secsize
| > | >      ( this should always match the hardware device)
| > |
| > | Note: the latter isn't necessarily true. If you take a disk image & move
| > | it to another system, it may change. Folks wish to continue using the
| > | disklabel number.
| >
| > This is why I mentioned it.  I am not as adamant about this,
| > but I was thinking that the in core value for this field
| > should always match the hardware sector size.  Currently,
| > the device strategy routines use d_secsize to interpret
| > bp->b_blkno.   If d_secsize does not match the hardware sector
| > size, then the device strategy routines will need to be
| > modified to do the appropriate conversion.
| 
| I think the modification has already been done. Looking at sd.c for
| instance, it happily translates from block numbers in DEV_BSIZE units into
| local sectors. It assumes that the sector size == the disklabel size, but
| that's a reasonable assumption.
|
| I'm not sure about your use of "interpret." Yes, d_secsize is used when
| looking at b_blkno. But b_blkno is kept in units of DEV_BSIZE.

I think you are agreeing with me on this point.  All I stated was
that the hardware sector size has to equal the disklabel d_secsize.
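
For concreteness, here is a minimal sketch of the translation being
discussed.  It is not copied from sd.c; devblk_to_sector is a
hypothetical helper, and it assumes that secsize is both the disklabel
d_secsize and the hardware sector size, and that DEV_BSIZE stays at
512.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSIZE 512	/* held at 512 for this sketch */

/*
 * Convert a block number kept in DEV_BSIZE units into a hardware
 * sector number.  Returns -1 if secsize is not a power of two of at
 * least DEV_BSIZE, or if blkno does not fall on a sector boundary.
 */
static int64_t
devblk_to_sector(int64_t blkno, uint32_t secsize)
{
	uint32_t ratio;

	assert(blkno >= 0);
	if (secsize < DEV_BSIZE || (secsize & (secsize - 1)) != 0)
		return -1;
	ratio = secsize / DEV_BSIZE;
	if (blkno % ratio != 0)
		return -1;
	return blkno / ratio;
}
```

With 2048-byte sectors, block 8 in DEV_BSIZE units maps to sector 2,
while block 3 is rejected because it starts in the middle of a sector.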

| > Careful, we need to keep compatibility with other vendors who
| > have already made this choice, for better or worse.  I have to
| 
| What about *ourselves*? I think compatibility with ourselves is the most
| important feature we can have. :-)

I most emphatically agree.  I am against any change which results
in any differences at all to an existing filesystem.

| Looking at NetBSD 1.5, we have already defined a format for what NetBSD's
| ffs does on non-512 byte media. Mainly as the only way to make it work
| there is to have set DEV_BSIZE to match the disk. :-) That set a
| zero-modification design spec, which is why I suggested it.

So?  Changing DEV_BSIZE does not need to remain the way we support
non-512-byte media.  If the 1.7 kernel with a DEV_BSIZE of 512 can
support any 2**n hardware media size, who cares whether anyone can
change DEV_BSIZE?  Furthermore, have you tested
to verify that NetBSD 1.5 actually works if DEV_BSIZE is not 512?
Even if it does work, NetBSD 1.5 kernels compiled with differing
values for DEV_BSIZE will have completely incompatible filesystems,
if for no other reason than DIRBLKSIZ will have changed.
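
As I understand the ufs headers, the directory block size (DIRBLKSIZ)
is defined in terms of DEV_BSIZE, and directory entries may not cross
a directory block boundary.  The following is a deliberately
simplified sketch of why that changes the on-disk layout.  The
function entry_offset is hypothetical, and it assumes fixed-size
entries, which real directories do not have.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte offset of entry number n when entries of reclen bytes are
 * packed into directory blocks of dirblksiz bytes and no entry is
 * allowed to cross a block boundary.
 */
static size_t
entry_offset(size_t n, size_t reclen, size_t dirblksiz)
{
	size_t per_block;

	assert(reclen != 0 && reclen <= dirblksiz);
	per_block = dirblksiz / reclen;
	return (n / per_block) * dirblksiz + (n % per_block) * reclen;
}
```

With 200-byte entries, the third entry (n = 2) lands at offset 512
when the directory block is 512 bytes but at offset 400 when it is
1024 bytes, so the same directory is laid out differently.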

| I think they all did different things. I know HP-UX has DEV_BSIZE=1024,
| and Convex's BSD OS (forgot the name) had DEV_BSIZE=2048.

I don't care to keep compatibility with their kernel defines, only
compatibility with the filesystems they generate.  Repeating their
mistakes is an even worse mistake.

| I think what would be more interesting is to see what Apple/Next did on
| the old 1k media.

The NeXTstep 3.3 CD filesystem is an example of their 2k media.  It is
supported by current Apple systems.  You can browse the Darwin source
code to see what they did to accommodate it.  I can also dig up 1k
media images from NeXTstep if you would like them, but this requires
me to assemble and boot some crufty equipment.

| Also interesting would be to see what some inodes look like.

The NeXT inodes are standard BSD FS_42INODEFMT, which we still support.
See also NetBSD PR bin/15449, which has a patch I plan to commit shortly.

| > As I mentioned, here is the partial
| > dumpfs output from a nextstep 3.3 operating system distribution CD:
| 
| Nothing in it looked bad or conflicting with what Trevin was doing.

Probably not, although there was not enough information in the part I
sent to see what units they use for the cylinder block count or
quota counts.  Since I can't remember offhand what I found last time
I checked, I will have to look it up.

Keep in mind that my experience in this area comes from the war zone.
My familiarity is a result of copying the NeXTstep CD image onto
512-byte media and getting NetBSD 1.5 to use it.  This was straightforward
to do without changing DEV_BSIZE, and I believe simultaneously
supporting various media sector sizes is an achievable direct step
from there.

Stepping back from the details of the implementation, let me briefly
list my requirements here.  This list is not a suggested priority or
implementation order.

  . do not affect a single bit in the current NetBSD filesystem layout.
  . simultaneously support multiple media sector sizes of the form 2**n
  . support filesystem images generated on other operating systems
    that support media with sector sizes other than 512 bytes
  . support filesystem images that have been copied between media
    with differing sector sizes.
  . do not make a change that prevents any item on this list from
    being eventually achievable.
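
As a rough sketch of the second requirement, the sector size can be
kept per device rather than per kernel.  The struct media and both
functions below are hypothetical names for illustration only, not an
existing interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device state: sector size stored as a shift. */
struct media {
	unsigned secshift;	/* log2 of the hardware sector size */
};

/*
 * Record a sector size for one device.  Returns 0 on success, or -1
 * if the size is not of the form 2**n.
 */
static int
media_set_secsize(struct media *m, uint32_t secsize)
{
	unsigned shift = 0;

	if (secsize == 0 || (secsize & (secsize - 1)) != 0)
		return -1;
	while ((UINT32_C(1) << shift) != secsize)
		shift++;
	m->secshift = shift;
	return 0;
}

/* Convert a byte offset on the device into a sector number. */
static uint64_t
media_byte_to_sector(const struct media *m, uint64_t off)
{
	assert(m->secshift < 64);
	return off >> m->secshift;
}
```

Two devices with 512-byte and 2048-byte sectors can then coexist,
each carrying its own shift, without any compile-time constant being
involved.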

Darrin