NetBSD-Bugs archive
bin/54591: lvm drops volumes on initial start
>Number: 54591
>Category: bin
>Synopsis: lvm drops volumes on initial start
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Oct 01 20:40:00 +0000 2019
>Originator: Martin Neitzel
>Release: NetBSD 9.99.12 2019-09-21
>Organization:
Gaertner Datensysteme, Marshlabs
>Environment:
System: NetBSD eddie.marshlabs.gaertner.de 9.99.12 NetBSD 9.99.12 (GENERIC) #0: Fri Sep 27 01:08:12 CEST 2019 neitzel%eddie.marshlabs.gaertner.de@localhost:/scratch/obj/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
Upon boot, /etc/rc.d/lvm fails to set up all of the logical volumes that have
been created: random entries in /dev/mapper/ are missing.
As a consequence, the corresponding filesystems cannot be mounted, and depending
on which filesystem is missing, the boot may already abort into single-user mode.
Recovery can be... tricky.
>How-To-Repeat:
During NetBSD installation, I defined a disklabel(8) partition /dev/rwd0e to
hold the space for an LVM physical volume:
neitzel 6 > disklabel wd0
[...]
total sectors: 234441648
[...]
5 partitions:
#        size    offset  fstype  [fsize bsize cpg/sgs]
 a:   4194304      2048  4.2BSD       0     0       0  # /
 b:   2097152   4196416    swap                        # swap
 c:  41943040      2048  unused       0     0          # NetBSD part.
 d: 234441648         0  unused       0     0          # whole disk
 e:  35651520   6293568   vinum                        # LVM PV
I created a simple volume group out of the single physical volume
and four logical volumes on it:
# lvm pvcreate /dev/rwd0e
# lvm vgcreate vg0 /dev/rwd0e
# lvm lvcreate -L 4g -n src vg0
# lvm lvcreate -L 5g -n scratch vg0
# lvm lvcreate -L 1g -n pkg vg0
# lvm lvcreate -L 2g -n local vg0
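For reference, the setup can be sanity-checked right after creation with the
usual LVM2 reporting subcommands (not part of the transcript above, just the
obvious checks, assuming they are built into the lvm(8) wrapper):
# lvm pvs
# lvm vgs vg0
# lvm lvs vg0
The LVM metadata itself appears to be intact, since a second service start
later activates all four volumes (see below); the dropouts only show up when
/etc/rc.d/lvm activates the group.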
I newfs'ed the filesystems on the volumes, prepared mount points, and made
/etc/fstab entries as usual. The "noauto" option and the no-fsck pass number
"0" are already part of the workaround:
/dev/mapper/vg0-local    /usr/local  ffs  rw,noauto  0 0
/dev/mapper/vg0-pkg      /usr/pkglv  ffs  rw,noauto  0 0
/dev/mapper/vg0-scratch  /scratch    ffs  rw,noauto  0 0
/dev/mapper/vg0-src      /usr/src    ffs  rw,noauto  0 0
While you would typically boot with
lvm=YES
in /etc/rc.conf, things get easier to repeat/debug/work around with lvm=NO
and running things manually. Once in multi-user mode:
pre-flight check:
/root 5 # modstat | grep -w dm
/root 6 # dmsetup table
No devices found
First lvm start, bringing only three of the four volumes online:
/root 7 # /etc/rc.d/lvm onestart
Configuring lvm devices.
Activated Volume Groups: vg0
/root 8 # modstat | grep -w dm
dm driver filesys a 0 18432 dk_subr
/root 9 # dmsetup table
vg0-local: 0 4194304 linear /dev/wd0e 384
vg0-pkg: 0 2097152 linear /dev/wd0e 4194688
vg0-src: 0 8388608 linear /dev/wd0e 6291840
/root 10 # ls -l /dev/mapper
total 0
crw-rw---- 1 root operator 194, 0 Aug 7 00:12 control
crw-r----- 1 root operator 194, 1 Oct 1 21:34 rvg0-local
crw-r----- 1 root operator 194, 2 Oct 1 21:34 rvg0-pkg
crw-r----- 1 root operator 194, 3 Oct 1 21:34 rvg0-src
brw-r----- 1 root operator 169, 1 Oct 1 21:34 vg0-local
brw-r----- 1 root operator 169, 2 Oct 1 21:34 vg0-pkg
brw-r----- 1 root operator 169, 3 Oct 1 21:34 vg0-src
This time, the "scratch" volume was missing. The count of three out of four
seems to be constant, but it is random which LV goes missing.
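(A quick way to see which LV got dropped -- assuming the usual LVM2 and
dmsetup(8) reporting commands -- is to compare what lvm believes exists with
what dm actually configured:
# lvm lvs -o vg_name,lv_name vg0
# dmsetup ls
Whatever lvs lists but "dmsetup ls" does not is the volume that was skipped.)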
Recover by restarting the LVM service. In separate steps:
/root 11 # /etc/rc.d/lvm onestop
Unconfiguring lvm devices.
Shutting Down logical volume: vg0/local
Command failed with status code 5.
Shutting Down logical volume: vg0/pkg
Obviously the "stop" runs into inconsistent information and leaves a bit
of debris behind; the "dm" kernel module stays loaded:
/root 12 # modstat | grep -w dm
dm driver filesys a 0 18432 dk_subr
/root 13 # ls -l /dev/mapper
total 0
crw-rw---- 1 root operator 194, 0 Aug 7 00:12 control
crw-r----- 1 root operator 194, 1 Oct 1 21:34 rvg0-local
brw-r----- 1 root operator 169, 1 Oct 1 21:34 vg0-local
/root 14 # dmsetup table
vg0-local: 0 4194304 linear /dev/wd0e 384
A second start brings all four volumes online:
/root 15 # /etc/rc.d/lvm onestart
Configuring lvm devices.
Activated Volume Groups: vg0
/root 16 # ls -l /dev/mapper
total 0
crw-rw---- 1 root operator 194, 0 Aug 7 00:12 control
crw-r----- 1 root operator 194, 1 Oct 1 21:34 rvg0-local
crw-r----- 1 root operator 194, 4 Oct 1 21:52 rvg0-pkg
crw-r----- 1 root operator 194, 6 Oct 1 21:52 rvg0-scratch
crw-r----- 1 root operator 194, 5 Oct 1 21:52 rvg0-src
brw-r----- 1 root operator 169, 1 Oct 1 21:34 vg0-local
brw-r----- 1 root operator 169, 4 Oct 1 21:52 vg0-pkg
brw-r----- 1 root operator 169, 6 Oct 1 21:52 vg0-scratch
brw-r----- 1 root operator 169, 5 Oct 1 21:52 vg0-src
/root 17 # dmsetup table
vg0-local: 0 4194304 linear /dev/wd0e 384
vg0-pkg: 0 2097152 linear /dev/wd0e 4194688
vg0-src: 0 8388608 linear /dev/wd0e 6291840
vg0-scratch: 0 10485760 linear /dev/wd0e 14680448
"mount -a" and work (almost) as usual.
>Fix:
None known yet. This may well be a "kern" bug rather than a "bin" bug.
Hey, I'm happy that I got this far to actually be able to load the
sources and still access them on next boot ;-)
It took me three or four installation attempts to get a 9.99.x
-current running at all, with the workarounds as described here. In
earlier attempts, I tried to install parts of the base system into
LVs and went nuts because randomly different parts would be missing
upon reboot. My first installation attempts were with GPT partitioning
and a GPT partition as the LVM physical volume; then I reverted to MBR
partitioning, and then I made sure nothing critical for a multi-user
login (such as /usr/pkg/bin/tcsh) resided on an LV.
Hence "Severity: serious" & "Priority: high".