Subject: Re: Supporting sector size != DEV_BSIZE -- patches
To: None <tech-kern@netbsd.org>
From: Trevin Beattie <trevin@xmission.com>
List: tech-kern
Date: 06/04/2002 13:25:35

At 09:49 AM 6/4/2002 -0700, Bill Studenmund wrote:
>> The reason I changed fs_fsbtodb to use DEV_BSIZE units is because most of
>> the instances of fsbtodb that I've found pass the result to bread() or
>> set/compare the value to b_blkno, which require DEV_BSIZE units.
>
>The ones that pass it to bread should use the alternate fs_fsbtodb.

I like Chuck's suggestion: shift by (fs_fshift - DEV_BSHIFT) and get rid of
fs_fsbtodb completely.
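
To make sure I'm reading that suggestion correctly, here is a minimal
sketch of the conversion as I understand it.  The struct and the function
name are made up for illustration (the real super-block has many more
fields), and I'm assuming DEV_BSHIFT is 9, i.e. DEV_BSIZE = 512:

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSHIFT 9	/* assumed: log2(DEV_BSIZE), DEV_BSIZE = 512 */

/* Hypothetical stand-in for the one super-block field used here. */
struct fs_sketch {
	int32_t fs_fshift;	/* log2(fragment size in bytes) */
};

/*
 * Convert a file system fragment number to a block number in
 * DEV_BSIZE units, which is what bread() and b_blkno expect.
 * No sector size and no fs_fsbtodb are involved.
 */
static int64_t
frag_to_devblk(const struct fs_sketch *fs, int64_t frag)
{
	/* Fragments smaller than DEV_BSIZE are not handled here. */
	assert(fs->fs_fshift >= DEV_BSHIFT);
	return frag << (fs->fs_fshift - DEV_BSHIFT);
}
```

With 2048-byte fragments (fs_fshift = 11), fragment 10 comes out as
DEV_BSIZE block 40, whatever the sector size of the disk underneath.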

If we really need to reference the sector size, should we have that stored
in the file system, or grab it from the disk label, or just ask the
hardware?  First of all, you have to convince me that we do need the sector
size at the file system level.

>
>> >But is our implementation breaking?
>>
>> With the exception of NeXTSTEP 3.0, I haven't found an implementation that
>> *isn't* broken WRT large sectors.  This includes NetBSD, although I must
>> say our implementation has greatly improved since I first looked at the
>> problem, at least three years ago.
>
>Did you try compiling a NetBSD kernel (and userland) with DEV_BSIZE set to
>2048?

I haven't yet; I strongly suspect that if I did, the new kernel wouldn't be
able to read my existing file system any more.  Sounds interesting, though,
and since I'm currently running NetBSD on a virtual machine, it wouldn't
hurt to try it out.

>
>That's the supported way of handling 2k-sector disks. And unless we broke
>it, it works (and has worked for YEARS).
>
>If you didn't compile a system with a changed DEV_BSIZE, then yes, it
>doesn't work.
>
>That's what we're trying to fix.
>
>However since things would work if we'd made DEV_BSIZE == sector size, we
>have a spec of what to expect.

Let me think about that... you're saying that whatever changes we make, a
file system created on a 2k/sector disk on a machine with DEV_BSIZE = 512
should be laid out exactly the same as one created on an unpatched system
with DEV_BSIZE = 2048?  That sounds reasonable.  I'd be perfectly happy,
though, if we could get a file system created by the former machine that
simply _works_ with the latter machine, and vice-versa.  (That is, mkfs may
lay out the blocks differently, but the superblock data is defined well
enough to allow both machines to read either disk.)

>
>> To sum up: the i386 port gets the sector boundary check wrong by a factor
>> of sector-size / DEV_BSIZE.  (I haven't looked at the other ports to see
>> whether they have similar bugs.)  ffs_mountfs(9) is broken WRT the location
>> of the super-block; it passes a sector number to bread, which is expecting
>> a DEV_BSIZE block number.  There were also several smaller problems with
>> mkfs computing fs fields in the superblock in terms of sector size; I don't
>> recall what they were offhand, but they'll be easy to recreate if you're
>> interested.
>
>mkfs should do things in terms of sectors. Only in-kernel ffs should deal
>with DEV_BSIZE.

But the kernel ffs depends on values stored in the super-block, which are
initialized by mkfs.  Here again, if we could get rid of everything in the
fs that was dependent on DEV_BSIZE or sector size, this wouldn't be an issue.
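
To put numbers on the super-block problem from my summary quoted above,
here is a rough sketch of the arithmetic.  The helper names are made up,
and I'm assuming the primary super-block sits at byte offset 8192 (SBOFF)
and that bread() counts in 512-byte DEV_BSIZE units:

```c
#include <stdint.h>

#define DEV_BSIZE 512	/* assumed kernel block unit */
#define SBOFF     8192	/* assumed byte offset of the primary super-block */

/* Block number bread() needs: the offset counted in DEV_BSIZE units. */
static int64_t
sb_devblk(void)
{
	return SBOFF / DEV_BSIZE;
}

/* The same offset counted in sectors, which is what gets passed today. */
static int64_t
sb_sector(uint32_t secsize)
{
	return SBOFF / secsize;
}

/* Byte offset bread() actually reads when handed block number blkno. */
static int64_t
bread_byte_offset(int64_t blkno)
{
	return blkno * DEV_BSIZE;
}
```

On a 2048-byte-sector disk, sb_sector(2048) is 4, and handing 4 to bread()
reads byte offset 2048 rather than 8192.  With 512-byte sectors the two
numbers coincide (both 16), which would explain why nobody notices on
ordinary disks.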

>
>> >I think that when you add a non-DEV_BSIZE disk to a system, our FS tools
>> >make/read the file system exactly as a kernel with DEV_BSIZE==that disk
>> >size would.
>>
>> I'm sorry; trying to figure that one out is giving me a headache again. :^P
>
>That's ok. :-)
>
>Just think of DEV_BSIZE as a kernel-only parameter. Everything else should
>be in terms of sectors.
>
>> >It should work with the actual sector size now. It just won't transition
>> >to DEV_BSIZE too well.
>>
>> It's just something I have to visualize and understand before I can trust
>> it.  Maybe if I tried to figure it out on paper...
>
>Have you read either the Red Book (Design & Implementation of the 4.4BSD
>Operating System by McKusick) or read Kirk's paper on FFS? Those would
>probably help.

I've got the Red Book sitting on my book shelf; I hadn't used it in a year
or two.  A brief scan through chapters 6-8 didn't reveal much though; it
seems mostly focused on design philosophy, with few implementation details.
The closest reference I could find to DEV_BSIZE is on p. 227: "In earlier
versions of BSD and in most other versions of UNIX, buffers were identified
by their physical disk block number.  4.4BSD changes this convention to
identify buffers by their logical block number within the file."

I'm not familiar with Kirk's paper.  Is this available online?

-----------------------
Trevin Beattie          "Do not meddle in the affairs of wizards,
trevin@xmission.com     for you are crunchy and good with ketchup."
      {:->                                     --unknown