tech-kern: Re: Logical Volume Managers

Subject: Re: Logical Volume Managers
To: Christian Limpach <chris@Nice.CH>
From: Bill Studenmund <wrstuden@zembu.com>
List: tech-kern
Date: 06/28/2000 10:10:38

On Wed, 28 Jun 2000, Christian Limpach wrote:

> I need per-vg and per-lv storage but accessing a lv needs also access to the
> per-vg storage of the vg this lv is in unless you want to duplicate this

Just add a pointer from the lv to the vg storage. :-)

Look at how the zsc/zstty code (the Zilog serial port driver) works. There
is storage for each zs chip, and code for the things which can live on that
chip. The latter have pointers into the chip-specific data (well,
channel-specific).
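
As a rough illustration of that layout applied to lvm, here is a minimal
sketch. The struct and field names (vg_softc, lv_softc, lv_vg and so on) are
made up for the example and are not taken from the zs driver or from any
existing lvm code; the only point is that the per-lv storage carries a
pointer to the per-vg storage instead of a copy of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-volume-group state, shared by every lv in the group. */
struct vg_softc {
	uint32_t vg_extent_size;	/* sectors per physical extent */
	uint32_t vg_nextents;		/* total extents in the group */
};

/* Hypothetical per-logical-volume state. */
struct lv_softc {
	struct vg_softc *lv_vg;		/* back pointer to the owning vg */
	uint32_t lv_nextents;		/* extents allocated to this lv */
};

/*
 * Size of a logical volume in sectors.  The extent size lives only in
 * the vg and is reached through the lv's back pointer.
 */
static uint64_t
lv_size_sectors(const struct lv_softc *lv)
{
	assert(lv != NULL && lv->lv_vg != NULL);
	return (uint64_t)lv->lv_nextents * lv->lv_vg->vg_extent_size;
}
```

Anything that is the same for every lv in a group stays in one place, and
each lv pays only for one pointer.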

> information in each per-lv storage.  I tend to prefer to use one softc for

One single struct, or one copy of that struct?

> the whole lvm system, except that I'm somewhat unclear on these points:
> - is the number of items in one pool limited or is there a performance
> advantage to use different pools for each vg or each lv?
> - since there is space for a disklabel in the struct disk and this space is
> used for on-the-fly generation of disklabels for DIOCGPART, can this break?

Maybe. I'd need to see the code.

> - are there advantages to allocating memory at boot time versus allocating
> memory as needed?

Not sure.

> My strategy routine gets a single struct buf to process.  If this request
> spans several physical extents or striping is used, I need to make several
> strategy calls to the drivers which are used for the actual storage of the
> data.  I use buffers which I manage via pool_init/pool_get/pool_put.  This
> is my understanding of the code ccd uses to do a similar thing.

Ahhh..
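
For reference, the splitting step described in the quoted paragraph could
look roughly like the sketch below. It only covers the arithmetic for the
non-striped case: one request, given as a starting sector and a sector
count, is cut into pieces that each stay inside one extent. The names
(struct piece, split_request) are invented for the example, and the real
driver would issue one strategy call with a pool-allocated buf per piece
rather than fill an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a request that fits entirely inside a single extent. */
struct piece {
	uint32_t p_extent;	/* logical extent index */
	uint64_t p_offset;	/* sector offset inside that extent */
	uint64_t p_count;	/* sectors covered by this piece */
};

/*
 * Split [blkno, blkno + count) at extent boundaries.  Writes at most
 * max pieces to out and returns how many were written.
 */
static size_t
split_request(uint64_t blkno, uint64_t count, uint64_t extent_size,
    struct piece *out, size_t max)
{
	size_t n = 0;

	assert(extent_size > 0);
	while (count > 0 && n < max) {
		uint64_t off = blkno % extent_size;
		uint64_t room = extent_size - off;	/* left in this extent */
		uint64_t len = count < room ? count : room;

		out[n].p_extent = (uint32_t)(blkno / extent_size);
		out[n].p_offset = off;
		out[n].p_count = len;
		n++;
		blkno += len;
		count -= len;
	}
	return n;
}
```

With 8-sector extents, a 10-sector request starting at sector 6 comes out
as two pieces: 2 sectors at the end of extent 0 and 8 sectors filling
extent 1.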

> I think I need to use disklabels since that's the structure which is used to
> present a partition (=logical volume) to the rest of the system.  My driver
> gets some DIOCGPART ioctls now and then and I don't know what would happen
> if I didn't process them correctly.

Hmmm... Yeah, that can be a problem. You'd break things if you didn't
support them.

> I think the problem is that disklabels are used for two things:
> - dividing up the space on a disk into several areas
> - representing partitions to the rest of the system
> I think they are suited quite well for the first use but not too well for
> the second use (see sbin/newfs.c and how it tries to find out the size of
> the partition it is going to initialize...).  There's also the limit on the
> number of partitions without which I would have used one disklabel to hold
> all the logical volumes (=partitions) of one volume group.  Instead I use
> one disklabel for each logical volume which is generated on the fly.

I misunderstood. I agree fully with you here, and think it makes a lot of
sense. :-)

> yes and no, wouldn't it be nicer to be able to use as many ccd's as you
> want?  I mean ccd's can be configured at any time, not only at boot and I
> don't see any reason in the ccd code why it wouldn't be possible to only
> allocate the memory needed when the ccd is configured.  It's "oh, there is a
> ccd here btw" which the user can trigger at any time.

It would be nice, but no one's implemented it yet. :-(

> > I'm not sure, but I think it'd be fine to make vg's pseudo-devices (you'll
> > probably have a good handle on the number of vg's you'll have around at
> > config time). Also, I think you can use the config framework to find lv's
> > under vg's (if not, you can probably talk Jason into doing it).
> 
> hmm, I would advocate against this since it will make the code in the kernel
> a lot bigger.  As it is now, the part in the kernel only uses the data
> structures it gets passed from the userspace programs.  The kernel won't
> read the configuration from disks and parse it.  I think it's better to do
> all this in userspace.

Oh, I was not clear.

Pseudo-devices are statically allocated at boot, while physical devices
are allocated on the fly (this is an approximation, but a good, general
one; cds, sds, wds, and a lot of other things match it). The difference
is that "physical" devices have a parent which will come along and say,
"oh, hey, I found another one of these." Pseudo-devices don't have such a
parent, so there's nothing to come along and say, "I found more." So they
are allocated at boot.

The trick is that the parent can "find" the device in response to a user
program initiating a special ioctl(). It doesn't have to go scan disks.

So the code you have now probably would work; I'm just suggesting tweaking
the way lv's and vg's find each other and are allocated. Oh, implicit in
this is that vg's (the parents) and lv's (the children) have different
softc structures. And children usually have pointers to their parents.
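
To make the shape of that concrete, here is a userland sketch, not kernel
code. It does not use the real autoconfiguration interfaces; vg_attach_lv()
and vg_detach_all() are hypothetical stand-ins for whatever the vg's ioctl
handler would call when a user program announces a new lv. The point is
only that the child's softc is allocated when the parent "finds" it, and
that it records a pointer back to the parent.

```c
#include <assert.h>
#include <stdlib.h>

struct lv_softc;

/* Hypothetical parent softc: one per volume group. */
struct vg_softc {
	struct lv_softc *vg_lvs;	/* list of attached children */
	int vg_nlvs;			/* number of attached children */
};

/* Hypothetical child softc: one per logical volume. */
struct lv_softc {
	struct vg_softc *lv_parent;	/* back pointer to the vg */
	struct lv_softc *lv_next;	/* next child of the same vg */
	int lv_unit;			/* unit number within the vg */
};

/*
 * Stand-in for the work done when a user program asks the vg to
 * attach an lv: allocate the child on demand and link it to its parent.
 * Returns NULL if the allocation fails.
 */
static struct lv_softc *
vg_attach_lv(struct vg_softc *vg)
{
	struct lv_softc *lv = calloc(1, sizeof(*lv));

	if (lv == NULL)
		return NULL;
	lv->lv_parent = vg;
	lv->lv_unit = vg->vg_nlvs++;
	lv->lv_next = vg->vg_lvs;
	vg->vg_lvs = lv;
	return lv;
}

/* Release every child of a vg, e.g. when the vg is unconfigured. */
static void
vg_detach_all(struct vg_softc *vg)
{
	while (vg->vg_lvs != NULL) {
		struct lv_softc *lv = vg->vg_lvs;

		vg->vg_lvs = lv->lv_next;
		free(lv);
	}
	vg->vg_nlvs = 0;
}
```

Nothing here is sized at boot: the number of lv's is whatever the user
has asked for so far.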

Take care,

Bill