Subject: Possible summer of code project: Logical Volume Manager (request for comments!)
To: None <tech-kern@netbsd.org>
From: Cameron Patrick <cameron@patrick.wattle.id.au>
List: tech-kern
Date: 06/04/2005 21:04:44
Hi folks,

I'm thinking of porting (or perhaps reimplementing :-/) Linux's
Logical Volume Manager to NetBSD as a Google Summer of Code project.
This is going to be a fairly long, rambling e-mail with my thoughts on
how I might go about it.  Comments on how feasible this is, or how
misguided I might be, would be appreciated ;-)

For those who aren't familiar with it, LVM is a block device that's
layered between physical devices and filesystems, providing more or
less "disklabels on steroids".  You have Volume Groups (VGs) which are
made up of one or more Physical Volumes (PVs - usually hard drive
partitions or RAID sets), and on top of which live various Logical
Volumes (LVs).  LVs are the equivalents of partitions in an LVM
set-up.  They can span multiple PVs if necessary (providing the
equivalent of NetBSD's CCD), and typically have filesystems living on
top of them.  LVs can be created, removed, expanded and shrunk on the
fly.  They don't have to occupy contiguous regions of disc space, so
it's possible to e.g. create LVs for /home and /var of 5GB each, and
then expand /home a little bit at a time as more space is required.
The physical disc would have /var right in the middle of /home, but
it's all transparent to the layers above it.  A number of Linux
filesystems (ext3, reiser, XFS) can be expanded on-line, so this kind
of re-shuffling generally doesn't even involve unmounting the
partitions.  LVM also provides snapshotting functionality somewhat
similar to NetBSD's fss device.  In short, it's very nifty :-)
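
To make the non-contiguous layout concrete, here is a toy sketch in C
of how an LV might pick up extents as it grows.  Everything in it
(NEXTENTS, owner[], lv_extend(), the first-fit policy) is made up for
illustration and is not how LVM2 actually allocates space; it only
shows why /var can end up in the middle of /home.

```c
#include <stdio.h>

#define NEXTENTS 12   /* hypothetical: a tiny PV with only 12 extents */
#define FREE     0    /* owner id meaning "not allocated to any LV" */

/* owner[i] records which LV (by id) owns physical extent i. */
static int owner[NEXTENTS];

/*
 * Hand up to 'count' free extents to LV 'lv', first-fit.
 * Returns the number of extents actually allocated.
 */
static int
lv_extend(int lv, int count)
{
    int got = 0;

    for (int i = 0; i < NEXTENTS && got < count; i++) {
        if (owner[i] == FREE) {
            owner[i] = lv;
            got++;
        }
    }
    return got;
}
```

Creating LV 1 (/home) with 4 extents, then LV 2 (/var) with 4 extents,
then growing LV 1 by another 2 leaves the PV laid out as
1 1 1 1 2 2 2 2 1 1 with two extents still free: /home is split around
/var, but anything reading the LV only ever sees one linear device.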

Near Equivalents in NetBSD
--------------------------
Vinum: so it's described as a volume manager, but really it's more of
a RAID implementation.  It still only has normal BSD disklabels on top
of it, so you can't do any of the fun expand-partitions-on-the-fly
stuff you can with LVM.

CCD: lets you glue discs together; same limitations as Vinum :-/

growfs: there's a growfs tool in the NetBSD source tree, although the
documentation claims it to be incomplete and not useful for FFS2 yet.
Fixing this would have to be part of a sane LVM implementation for
NetBSD.  Being able to growfs a mounted filesystem would be nice but
implementing that far exceeds my current knowledge of filesystem voodoo.

Previous attempts at porting LVM: this mailing list post -
    http://mail-index.netbsd.org/netbsd-users/2003/02/04/0012.html
describes a port of LVM to NetBSD 1.6 a couple of years ago.  It
appears to have been completely ignored by the world, but would
probably still be a useful starting point.  The referenced pkgsrc
tarball for device mapper says "licensing unclear" which is a bit
off-putting though...

The Linux Implementation
------------------------
Linux's LVM is divided cleanly into the kernel part (device mapper)
and a userspace part (LVM2).  The kernel part is quite clean and
straightforward: it's a general system for setting up block devices
which map onto combinations of other block devices by specifying a
table which might look like, e.g.:
  sectors 0 - 10,000       /dev/wd0f starting at sector 3
  sectors 10,000 - 50,000  /dev/wd1a starting at sector 5,000
  sectors 50,000 - 80,000  /dev/wd0f starting at sector 30,000
Device mapper also supports more complex targets e.g. mirroring,
striping, and encryption; it's used to provide the Linux equivalent of
cgd and also access to RAID volumes on "software RAID in BIOS" cards.
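
As a rough illustration of the linear case, the lookup such a table
implies could be sketched in C as below.  This is not device mapper's
actual code or data layout: struct map_entry, example_table and
map_sector() are names I've invented, and I'm assuming each range in
the table above is half-open (the end sector is not included).

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a hypothetical linear mapping table. */
struct map_entry {
    uint64_t start;       /* first logical sector covered (inclusive) */
    uint64_t end;         /* one past the last logical sector covered */
    const char *dev;      /* name of the underlying device */
    uint64_t dev_offset;  /* sector on 'dev' that 'start' maps to */
};

/* The example table from the text, read as half-open ranges. */
static const struct map_entry example_table[] = {
    {     0, 10000, "/dev/wd0f",     3 },
    { 10000, 50000, "/dev/wd1a",  5000 },
    { 50000, 80000, "/dev/wd0f", 30000 },
};

/*
 * Translate a logical sector into (device, sector on that device).
 * Returns 0 on success, -1 if no table entry covers the sector.
 */
static int
map_sector(const struct map_entry *tbl, size_t n, uint64_t sector,
    const char **dev, uint64_t *dev_sector)
{
    for (size_t i = 0; i < n; i++) {
        if (sector >= tbl[i].start && sector < tbl[i].end) {
            *dev = tbl[i].dev;
            *dev_sector = tbl[i].dev_offset + (sector - tbl[i].start);
            return 0;
        }
    }
    return -1;
}
```

With that table, logical sector 12,345 falls in the second row and
lands on /dev/wd1a at sector 5,000 + 2,345 = 7,345.  A real driver
would want something better than a linear scan for big tables, and
would have to split I/O requests that straddle two rows.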

On top of that there's a userspace portion called LVM2 which uses device
mapper to create devices corresponding to logical volumes.  It takes
care of the on-disc structure of an LVM volume and so on.  It also
looks like it should be fairly portable as almost everything it does
goes through the libdevmapper API.

Approaches to a NetBSD port
---------------------------
1) Something similar to (and/or based on) Christian's previous effort
   would be the simplest way to go.  i.e. a small kernel driver
   similar to Linux's device mapper, and patches to libdevmapper to
   support it in user space.  This has the advantage that just about
   everything that the Linux LVM can do would JustWork(tm) once the
   infrastructure is in place. This has the disadvantage (from a BSD
   perspective) that large chunks of it - i.e. pretty much all the
   userland - would be GPL'ed and so not suitable for inclusion with
   the base system.

2) A complete reimplementation would almost certainly be more effort
   and less featureful than adapting the Linux version - it'd be a
   heck of a lot of work just getting the basic functionality there,
   let alone all the fruit that LVM2 has.  However it could be
   BSD-licensed and better integrated into the NetBSD system.  Having
   LVM work "out of the box" would be nice.  A device mapper-alike
   could probably also swallow the ccd and cgd devices; not sure if
   this is desirable.

3) There's something else that I've missed entirely (e.g. put it in
   the 'too hard' basket).

Considerations for putting root on LVM: on Linux, this is typically
handled by having a separate /boot partition containing the kernel and
bootloader, and an initrd ("initial ramdisk") which contains the LVM
tools and a script to set up the devices and mount the root fs.  This
isn't really practical on NetBSD from my understanding of the boot
process.  A more BSD-ish way would perhaps be to put some kind of
devmapper partition detection into the kernel.  However it's done,
this is very much a secondary concern to be addressed in more detail
(if at all) once LVM is actually working.

Cheers,

Cameron.