Subject: RE: GPT support still needed? (was: RE: Recursive partitioning)
To: None <tech-kern@NetBSD.org>
From: De Zeurkous <zeurkous@nichten.info>
List: tech-kern
Date: 06/06/2007 11:58:30
Haai,

On Wed, June 6, 2007 11:17, Allen Briggs wrote:
>>[snip]
>
> "Fits perfectly" is where some folks would disagree.

I thought that might be the case :^)

>
> Let's back up for a minute.  One problem that we've had ever since
> the earliest days on an i386-based system is the dichotomy between
> the native partitioning method (call it what you will) and the BSD
> partitioning method (disklabel).  On traditional BSD systems, there
> was just the disklabel.  The PC needed a lower-level format, and
> BSD needed to coexist with it.  So disklabelling became a two-step
> process: prepare the native label, and prepare the BSD label. And
> people want to be able to share BSD with other operating systems--
> Linux, Windows, DOS, OS/2,

Like the IBM PC, those OSes are either obsolescent or outright obsolete.
In a few years, the remaining ones will be gone and no one is going to
care about running them on physical hardware.

> other BSD installs,

This may sound incredibly naive, but we could just agree upon one
disklabel format, right?! There are higher-level issues to deal with...

> whatever.  And so
> we need to be able to treat the BSD portion of the disk separately
> from the whole disk as well as be able to access the whole disk at
> times, so the choice was made to have two 'raw disk' partitions, one
> meaning 'whole disk' and one meaning 'BSD portion of disk.' That
> has been a small thorn in our side for quite a while. It's a bit
> confusing to both people and code, and it makes for one less usable
> partition.

Considering the limitations of the IBM PC this is a logical choice; I see
no reason to change it. If people want more, they should move on to a more
solidly designed architecture.

> Now we're also up to the point where logical volumes
> (if not quite yet individual spindles) can exceed the sizes that we
> can represent in the traditional disklabels.
>
> These are orthogonal issues.  Wedges help to solve some of them, but
> there are also some other reasons that wedges make sense, such as
> the fact that the old Apple partition scheme doesn't map very well
> to disklabels.  We've hacked around it for years, but wedges map
> much better to that system. And we'd like to have a more consistent
> view of the devices between ports.

I'm not proposing we dump wedges -- I'm simply stating that they are a
solution to compatibility problems between BSD, the host architecture,
and/or other OSes. They are not a nice replacement for our current layout
for physical devices.

> And then there's the question of
> how many partitions does that disklabel/port support?  8?  16?

Isn't this even a consistent number across ports? :S I was not aware of
that. Another reason for a new, unified disklabel.

> It's
> a "flag day" for a port if/when it changes, without a device filesystem,
> and it's a bit of a pain to deal with making devices properly in a
> machine-independent fashion.

Then again, this should have been fixed a long, long time ago; preferably
that mistake should never have been made...
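
To make the "flag day" concrete: BSD disk device nodes traditionally pack
the unit number and the partition index into the minor number as
unit * MAXPARTITIONS + partition, so changing a port's partition count
renumbers every existing node in /dev. A minimal Python sketch of that
arithmetic follows. It is a simplification, not code from any port's
headers: the function names are made up for illustration, and any
compatibility encoding a real port may use is ignored.

```python
def disk_minor(unit, part, maxpartitions):
    """Pack (unit, partition index) into a minor number (simplified)."""
    if not 0 <= part < maxpartitions:
        raise ValueError("partition index out of range")
    return unit * maxpartitions + part


def disk_unpack(minor, maxpartitions):
    """Split a minor number back into (unit, partition index)."""
    return divmod(minor, maxpartitions)


def part_letter(part):
    """Partition index 0 is 'a', 1 is 'b', and so on."""
    return chr(ord("a") + part)


# A device node for unit 1, partition 'a', created under 8 partitions.
old_minor = disk_minor(1, 0, 8)

# The same node as seen by a kernel built with 16 partitions.
unit, part = disk_unpack(old_minor, 16)
print(old_minor, unit, part_letter(part))
```

The node made for unit 1, partition 'a' is now read as unit 0 with a
partition index past 'h', which is why every node has to be recreated in
step with the kernel unless something like a device filesystem does it.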

> For just going to wider representations of the partition size &
> offset, sure, it's not too bad to define a disklabel64 or whatever--
> the technical issues surrounding that are relatively easily hacked
> around.  But GPT is an existing format that maps well to wedges,
> which we want for other reasons.

Is some brain-dead existing format better than a new format that is
guaranteed to work? If so, we should consider implementing a full-fledged
ROM BASIC-clone in the kernel debugger...
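
For scale, the arithmetic behind "wider representations": the traditional
disklabel stores partition size and offset as 32-bit sector counts, so
with 512-byte sectors nothing past 2 TiB can be described. A rough Python
sketch of that limit follows. The 512-byte sector size is an assumption
(larger sectors raise the ceiling), and the helper names are made up for
illustration.

```python
SECTOR_SIZE = 512  # assumption: classic 512-byte sectors
TIB = 1 << 40


def max_addressable_bytes(field_bits, sector_size=SECTOR_SIZE):
    """Bytes reachable when sector counts are stored in field_bits bits."""
    return (1 << field_bits) * sector_size


def fits(offset_sectors, size_sectors, field_bits):
    """True if both the offset and the size fit in a field_bits-wide field."""
    limit = 1 << field_bits
    return offset_sectors < limit and size_sectors < limit


# A hypothetical 3 TiB logical volume, expressed in sectors.
three_tib_sectors = 3 * TIB // SECTOR_SIZE

print(max_addressable_bytes(32) // TIB)   # ceiling of 32-bit fields, in TiB
print(fits(0, three_tib_sectors, 32))     # does not fit in 32-bit fields
print(fits(0, three_tib_sectors, 64))     # fits once the fields are 64-bit
```

Widening the two fields is the whole of the size fix; it does nothing for
the partition-count or raw-partition issues raised above.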

> It also allows for other systems
> and tools to interoperate with the disks--not an issue for many
> installs, but an issue for some,

Again, legacy stuff which an MBR can perfectly serve. Heck, even Lunix can
put its file systems in a BSD disklabel, if created with the BSD version
in question...

> and more of an issue for USB or
> firewire (or eSATA) disks that may well be shared between different
> kinds of systems for backups or sharing data or whatever.

We have something called a 'file server' for USB and SATA. As for SCSI,
data is not a problem, and I don't think it's a very good idea to have
multiple systems boot from the same disk, unless very carefully
regulated...

> So why
> have yet another way of looking at the disks?

A new disklabel format != 'another way' of looking at disks; it simply
broadens your view somewhat :) As for recursive partitioning: it's a hack,
but the bells and whistles I proposed are only casually related to it.

> It adds to the
> complexity of low-level code, resulting in more code to compile
> (size) and test / debug.

As for the new disklabel format, this is trivial. As for the recursive
partitioning approach, I already mentioned the requirement of having root
FSes (and preferably dump devices) in the main disklabel.

Baai,

De Zeurkous
-----------

Friggin' Machines!

>
> -allen
>