Subject: Re: Vinum or other lvm
To: None <port-i386@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: port-i386
Date: 01/27/2002 17:30:02
[ On Sunday, January 27, 2002 at 13:39:09 (+0100), Jaromir Dolecek wrote: ]
> Subject: Re: Vinum or other lvm
>
> Berndt Josef Wulf wrote:
> > FreeBSD supports vinum as a module... how hard would it be to port?
> 
> AFAIK FreeBSD vinum has zero advantage over raidframe.

Actually it has many disadvantages (though performance is probably not
one of them).  It's VERY fragile w.r.t. handling errors in its
configuration (which is stored in hidden areas on the disk but not
validated when used), and it's rather hard to configure correctly (at
least for RAID-5).  I had a lot of crashes and busted filesystems while
learning how to configure it, and I learned the hard way that I had to
dd blocks of zeros over the beginnings of the partitions I was using in
order to clear broken configs and start over.
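For the record, "dd blocks of zeros" means something like the following
-- the device name and the sector count here are assumptions you would
adjust for your own disks and for however much of the partition start
vinum actually uses:

```shell
# Wipe the start of the partition where vinum keeps its hidden config.
# /dev/rda0s1e and the 1024-sector count are assumptions; pick your own.
# dd if=/dev/zero of=/dev/rda0s1e bs=512 count=1024
# Illustrated here against a scratch file instead of a real device:
scratch=${TMPDIR:-/tmp}/fakepart
dd if=/dev/zero of="$scratch" bs=512 count=1024 2>/dev/null
wc -c < "$scratch"
```

After that vinum no longer finds the stale on-disk config and you can
build a fresh one from scratch.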

VINUM is very flexible, and has some interesting differences from
RAIDframe, but overall I'm _MUCH_ happier with RAIDframe in NetBSD.
Other than for the purpose of eliminating variables when comparing
their implementations, I see no advantage to bringing VINUM to NetBSD.

I'm using VINUM to provide a concatenated RAID-5 partition for /var on
a couple of boxes running Squid.  The idea is that should any disk fail
I can just pull it and continue with /var in degraded mode until it can
be replaced (the root filesystem is replicated with rsync, Squid knows
how to carry on with a missing cache directory, and since the secondary
drives started life as raw copies they are bootable and should contain
the same VINUM configuration).  I've not tested its recovery abilities
yet though.  :-)
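For what it's worth, the setup looks roughly like the following input
to "vinum create" -- the drive names, the volume name, the 512k stripe
size, and using "length 0" to take each drive's remaining space are all
assumptions standing in for my real config:

```shell
# Sketch of a vinum create(8) config for a RAID-5 /var volume.
# Device names da1s1h..da3s1h and the 512k stripe size are assumptions.
conf=${TMPDIR:-/tmp}/vinum.conf
cat > "$conf" <<'EOF'
drive d1 device /dev/da1s1h
drive d2 device /dev/da2s1h
drive d3 device /dev/da3s1h
volume var
  plex org raid5 512k
    sd length 0 drive d1
    sd length 0 drive d2
    sd length 0 drive d3
EOF
# vinum create "$conf"     # then newfs /dev/vinum/var and mount it
```

With three drives and one stripe's worth of parity per stripe you get
roughly two drives' worth of usable space, and any single drive can be
pulled without losing /var.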

> Particularly, AFAIK it doesn't support dynamic growing of the set.

VINUM is a real logical volume manager, but for now does nothing to/for
the filesystem layer.  However it can dynamically grow some types of
LVs.  This from vinum(8) on FreeBSD-4.4:

     o   It is possible to increase the size of a concatenated vinum plex, but
         currently the size of striped and RAID-5 plexes cannot be increased.
         Currently the size of an existing UFS file system also cannot be
         increased, but it is planned to make both plexes and file systems
         extensible.

This from vinum(4) on FreeBSD-4.4:

DESCRIPTION
     vinum is a logical volume manager inspired by, but not derived from, the
     Veritas Volume Manager.  It provides the following features:

     o   It provides device-independent logical disks, called volumes.  Vol-
         umes are not restricted to the size of any disk on the system.

     o   The volumes consist of one or more plexes, each of which contain the
         entire address space of a volume.  This represents an implementation
         of RAID-1 (mirroring).  Multiple plexes can also be used for

         o   Increased read throughput.  vinum will read data from the least
             active disk, so if a volume has plexes on multiple disks, more
             data can be read in parallel.  vinum reads data from only one
             plex, but it writes data to all plexes.

         o   Increased reliability.  By storing plexes on different disks,
             data will remain available even if one of the plexes becomes
             unavailable.  In comparison with a RAID-5 plex (see below), using
             multiple plexes requires more storage space, but gives better
             performance, particularly in the case of a drive failure.

         o   Additional plexes can be used for on-line data reorganization.
             By attaching an additional plex and subsequently detaching one of
             the older plexes, data can be moved on-line without compromising
             access.

         o   An additional plex can be used to obtain a consistent dump of a
             file system.  By attaching an additional plex and detaching at a
             specific time, the detached plex becomes an accurate snapshot of
             the file system at the time of detachment.

     o   Each plex consists of one or more logical disk slices, called sub-
         disks.  Subdisks are defined as a contiguous block of physical disk
         storage.  A plex may consist of any reasonable number of subdisks (in
         other words, the real limit is not the number, but other factors,
         such as memory and performance, associated with maintaining a large
         number of subdisks).

     o   A number of mappings between subdisks and plexes are available:

         o   Concatenated plexes consist of one or more subdisks, each of
             which is mapped to a contiguous part of the plex address space.

         o   Striped plexes consist of two or more subdisks of equal size.
             The file address space is mapped in stripes, integral fractions
             of the subdisk size.  Consecutive plex address space is mapped to
             stripes in each subdisk in turn.  The subdisks of a striped plex
             must all be the same size.

         o   RAID-5 plexes require at least three equal-sized subdisks.  They
             resemble striped plexes, except that in each stripe, one subdisk
             stores parity information.  This subdisk changes in each stripe:
             in the first stripe, it is the first subdisk, in the second it is
             the second subdisk, etc.  In the event of a single disk failure,
             vinum will recover the data based on the information stored on
             the remaining subdisks.  This mapping is particularly suited to
             read-intensive access.  The subdisks of a RAID-5 plex must all be
             the same size.

     o   Drives are the lowest level of the storage hierarchy.  They represent
         disk special devices.

     o   vinum offers automatic startup.  Unlike UNIX file systems, vinum vol-
         umes contain all the configuration information needed to ensure that
         they are started correctly when the subsystem is enabled.  This is
         also a significant advantage over the Veritas(tm) File System.  This
         feature regards the presence of the volumes.  It does not mean that
         the volumes will be mounted automatically, since the standard startup
         procedures with /etc/fstab perform this function.
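The rotating parity placement in the RAID-5 description above (parity
on the first subdisk for the first stripe, the second subdisk for the
second stripe, and so on) boils down to simple modular arithmetic.  A
sketch, with the subdisk count of 3 being an assumed example:

```shell
#!/bin/sh
# Rotating parity placement as described in vinum(4): for stripe i in a
# RAID-5 plex of n subdisks, parity lives on subdisk (i mod n) and data
# occupies the remaining n-1 subdisks.
n=3                       # number of subdisks in the plex (assumed)
i=0
while [ "$i" -lt 6 ]; do
    p=$(( i % n ))
    echo "stripe $i: parity on subdisk $p"
    i=$(( i + 1 ))
done
```

Spreading parity across all subdisks this way avoids the dedicated
parity disk becoming a write bottleneck, which is why the text calls
the mapping well suited to read-intensive access.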

-- 
								Greg A. Woods

+1 416 218-0098;  <gwoods@acm.org>;  <g.a.woods@ieee.org>;  <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>