Subject: Re: Supporting sector size != DEV_BSIZE
To: Trevin Beattie <trevin@xmission.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 10/07/2001 08:28:40
On Mon, 8 Oct 2001, Trevin Beattie wrote:

> At 05:38 AM 10/7/2001 -0700, you wrote:
> >Well, it's nice that you feel they have no basis in reality. They've been
> >that way for quite a while. :-)
>
> So I see, according to PR kern/495 (7 years old!)

They are FAR older than that. I'm not sure, but I'd guess they date back
to around 4.3BSD. I'd put them at around 15 to 20 years old.

> >I've been there and done that. You can check out the results on the
> >wrstuden-dev-bsize branch. I had scsi disks working fine, with both 512
> >and 2048 byte sectors in the same system. i386 floppies had a problem.
>
> Where would I find that branch?

It's in the cvs repository. I think it only covers syssrc, but it might
also cover basesrc.

> >There are three PRs in the database talking about different ways to fix
> >this. They were written by a Japanese developer, who unfortunately died in
> >a motorcycle accident shortly thereafter. My efforts on the branch were
> >following one of the PRs (forgot which one) which got rid of DEV_BSIZE.
>
> I've seen PR's kern/3790, 3791, & 3792 (which were written for NetBSD-1.2E)
> and even tried applying them to NetBSD-1.4.2, but still had problems.  The
> later PR's deal with the Amiga port.  Although these early patches seem
> better developed than mine, the last NetBSD release has well over 300
> expressions which evaluate DEV_BSHIFT, and none of the patches appear to
> cover all of them.

It's been a while, but as I recall, there were three ideas:

1) The buffer cache uses the native device block size for block numbers.

2) The buffer cache uses DEV_BSIZE for block sizes, but sizes/offsets must
be evenly translatable into native blocks. For instance, with DEV_BSIZE =
512 on an optical device w/ 2048-byte blocks, all your block counts and
offsets have to be divisible by 4, since 4 DEV_BSIZE blocks == one
on-media block.

3) The buffer cache uses DEV_BSIZE for block sizes, and offsets & sizes
can be whatever. The drivers have to deal with the case where a block in
the buffer cache represents part of an on-disk block.

My efforts were using approach 1, and I think Chuck is doing 2.
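
As a rough sketch of the arithmetic behind the three ideas (the helper
names below are made up for illustration and are not taken from the branch
or the PRs; it assumes the native sector size is a multiple of DEV_BSIZE):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEV_BSHIFT 9                    /* log2(DEV_BSIZE) */
#define DEV_BSIZE  (1 << DEV_BSHIFT)    /* 512 bytes */

/* Hypothetical helper: how many DEV_BSIZE units make up one native sector. */
static uint32_t
units_per_sector(uint32_t secsize)
{
	/* Assumption: sector size is a nonzero multiple of DEV_BSIZE. */
	assert(secsize >= DEV_BSIZE && secsize % DEV_BSIZE == 0);
	return secsize / DEV_BSIZE;
}

/* Approach 1: block numbers are already native sectors. */
static int64_t
sector_to_byte_offset(int64_t secno, uint32_t secsize)
{
	return secno * (int64_t)secsize;
}

/* Approach 2: DEV_BSIZE block numbers and counts must both fall on
 * native sector boundaries. */
static bool
request_is_aligned(int64_t blkno, uint32_t nblks, uint32_t secsize)
{
	uint32_t n = units_per_sector(secsize);

	return blkno % n == 0 && nblks % n == 0;
}

/* Approach 2: translate an aligned DEV_BSIZE block number to a sector. */
static int64_t
devblk_to_sector(int64_t blkno, uint32_t secsize)
{
	return blkno / units_per_sector(secsize);
}

/* Approach 3: a DEV_BSIZE block may be only part of a native sector;
 * this gives its byte offset inside that sector. */
static uint32_t
devblk_offset_in_sector(int64_t blkno, uint32_t secsize)
{
	return (uint32_t)(blkno % units_per_sector(secsize)) * DEV_BSIZE;
}
```

With 2048-byte sectors, block 8 with a count of 4 passes the approach 2
check and maps to sector 2, while block 6 does not pass; under approach 3
the driver would instead have to handle block 6 as the piece starting 1024
bytes into sector 1.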

> >The branch is abandoned at the moment. Among other things, Chuck Silvers,
> >who has done the UBC work, has followed on with a different approach (one
> >of the other PRs). As he is making things happen and I am not in a
> >position to work on it, I'll let him finish things up.
>
> Is there any documentation of his efforts and progress?

I don't think so.

> >See past threads and the PRs. You've basically presented two of the three
> >options.
>
> From the introductory comments, I can't tell the difference between PR's
> 3790 and 3792.  3792 just has more changes.  (In fact, the initial part of
> 3792 is almost identical to 3790.)

I think the PRs are solutions 2, 1, and 3 respectively, but it's been a
while.

> What I'd like more than seeing a patch, though, is a well thought-out (and
> documented) design.  The meaning of block size vs. fragment size vs. sector
> size should be explained, their usage and boundaries at various levels of
> the kernel clearly defined, and their storage locations in disk structures
> and in memory given.  (For example, the sector size given in a disk label
> should be used only to convert partition offsets and sizes from sectors to
> bytes, not for determining block numbers in a partition (which are in terms
> of fragment size) or even physical sector numbers, since (as was pointed
> out by Chris G. Demetriou in PR 3460) a file system could be dumped from
> one disk to another having a different sector size.)

The problem is that block size and fragment size are artifacts of ffs, and
are independent of sector size, which is the purview of device drivers and
the buffer cache. Also, to make matters worse, what is advertised as an
ffs fragment in the user docs is referred to as a block in the code! :-)

I think I had the branch such that an ffs file system could be copied from
a disk of one sector size to another w/o problem, as long as the fragment
size was greater than or equal to the media block size.
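
To illustrate that condition, here is a minimal sketch (hypothetical
helpers, not code from the branch; it assumes both sizes are powers of
two, so "greater than or equal" also means "an exact multiple"):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: every fragment maps onto whole native sectors only
 * if the fragment size is a multiple of the media sector size. */
static bool
fs_fits_media(uint32_t fsize, uint32_t secsize)
{
	return secsize != 0 && fsize >= secsize && fsize % secsize == 0;
}

/* First native sector of fragment fragno, relative to the start of the
 * partition. Only meaningful when fs_fits_media() holds. */
static int64_t
frag_to_sector(int64_t fragno, uint32_t fsize, uint32_t secsize)
{
	return fragno * (int64_t)(fsize / secsize);
}
```

A file system with 2048-byte fragments would work out on either 512-byte
or 2048-byte media under this check; one with 1024-byte fragments would
not on the 2048-byte media.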

Take care,

Bill