tech-kern archive
Re: zfs and device name changes
hi,
On Mon, Mar 30, 2026 at 8:13 PM Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:
>
> On Fri, 27 Mar 2026, Takashi YAMAMOTO wrote:
> > On Fri, Mar 27, 2026 at 9:17 PM Stephen Borrill <netbsd%precedence.co.uk@localhost> wrote:
> >>
> >> On Fri, 27 Mar 2026, Takashi YAMAMOTO wrote:
> >>> hi,
> >>>
> >>> On Thu, Mar 26, 2026 at 12:41 PM Taylor R Campbell <riastradh%netbsd.org@localhost> wrote:
> >>>>
> >>>>> Date: Tue, 24 Mar 2026 09:14:37 +0900
> >>>>> From: Takashi YAMAMOTO <yamt9999%gmail.com@localhost>
> >>>>>
> >>>>> the attached patch is my attempt to make zfs a bit more robust against
> >>>>> device name changes.
> >>>>> the identical patch is available at github too:
> >>>>> https://github.com/yamt/netbsd-src/commit/32283c2e362034301c3da218a05849c04ee20c2a
> >>>>>
> >>>>> while it seems to be working as far as i tested, i'd be happy if someone can review it
> >>>>> as my knowledge of zfs (well, and recent netbsd in general) is weak.
> >>>>
> >>>> I don't understand why all this new code is needed. Doesn't zfs
> >>>> already have logic to scan all disks/partitions/wedges and find the
> >>>> vdevs by guid?
> >>>
> >>> which code are you talking about?
> >>> it's entirely possible i'm missing something as i'm new to the code base.
> >>>
> >>>>
> >>>> I am under the impression that /etc/zfs/zpool.cache may bypass the
> >>>> scan so this doesn't work in some circumstances, but in my years of
> >>>> using zfs on various machines with frequent device renumbering of cgd
> >>>> volumes and dkN wedges, I have never encountered this type of trouble
> >>>> myself, and I'm not sure what I'm doing differently.
> >>>
> >>> do you mean zfs finds vdevs after renumbering without zpool import?
> >>> it doesn't match my experience.
> >>> without this patch, i had to use zpool export/import after:
> >>> - modifying gpt in a way that affects dk numbering
> >>> - swapping qemu disk images
> >>
> >> Naively:
> >>
> >> # zpool create tank mirror xbd2 xbd3 mirror xbd4 xbd5
> >> # zpool status
> >>   pool: tank
> >>  state: ONLINE
> >>   scan: none requested
> >> config:
> >>
> >>         NAME        STATE     READ WRITE CKSUM
> >>         tank        ONLINE       0     0     0
> >>           mirror-0  ONLINE       0     0     0
> >>             xbd2    ONLINE       0     0     0
> >>             xbd3    ONLINE       0     0     0
> >>           mirror-1  ONLINE       0     0     0
> >>             xbd4    ONLINE       0     0     0
> >>             xbd5    ONLINE       0     0     0
> >>
> >> errors: No known data errors
> >> # halt -p
> >>
> >> ** Remove xbd1 to simulate failed/disconnected disk
> >> ** means xbd2 -> xbd1, xbd3 -> xbd2, etc.
> >>
> >> After boot:
> >>
> >> # zpool status
> >>   pool: tank
> >>  state: UNAVAIL
> >> status: One or more devices could not be opened. There are insufficient
> >>         replicas for the pool to continue functioning.
> >> action: Attach the missing device and online it using 'zpool online'.
> >>    see: http://illumos.org/msg/ZFS-8000-3C
> >>   scan: none requested
> >> config:
> >>
> >>         NAME                     STATE    READ WRITE CKSUM
> >>         tank                     UNAVAIL     0     0     0
> >>           mirror-0               UNAVAIL     0     0     0
> >>             6289893268167966748  FAULTED     0     0     0  was /dev/xbd2
> >>             4017376292647041077  FAULTED     0     0     0  was /dev/xbd3
> >>           mirror-1               UNAVAIL     0     0     0
> >>             4378765686596708079  FAULTED     0     0     0  was /dev/xbd4
> >>             6863498524284650610  UNAVAIL     0     0     0  was /dev/xbd5
> >>
> >> After yamt's patch:
> >>
> >> dmesg shows:
> >> ZFS WARNING: vdev guid mismatch for /dev/xbd2, actual 37c09674044d7835
> >> expected 574a309a2445c01c
> >> ZFS: trying to find a vdev (/dev/xbd2) by guid 574a309a2445c01c
> >> ZFS WARNING: vdev guid mismatch for /dev/xbd3, actual 3cc48031384426ef
> >> expected 37c09674044d7835
> >> ZFS: trying to find a vdev (/dev/xbd3) by guid 37c09674044d7835
> >> ZFS WARNING: vdev guid mismatch for /dev/xbd4, actual 5f400b8b20678472
> >> expected 3cc48031384426ef
> >> ZFS: trying to find a vdev (/dev/xbd4) by guid 3cc48031384426ef
> >> ZFS: trying to find a vdev (/dev/xbd5) by guid 5f400b8b20678472
> >
> > my patch doesn't scan xbd disks as it wasn't included in the list.
> > (see device_is_eligible_for_vdev)
> > if you add it, it should be found.
>
> Indeed. I should have looked at the patch more closely!
>
> After adding xbd to the list then, if I remove xbd1, dmesg shows:
>
> ZFS WARNING: vdev guid mismatch for /dev/xbd2, actual 37c09674044d7835
> expected 574a309a2445c01c
> ZFS: trying to find a vdev (/dev/xbd2) by guid 574a309a2445c01c
> ZFS: vdev 574a309a2445c01c found: xbd1 (bdev 142:1)
> ZFS WARNING: vdev guid mismatch for /dev/xbd3, actual 3cc48031384426ef
> expected 37c09674044d7835
> ZFS: trying to find a vdev (/dev/xbd3) by guid 37c09674044d7835
> ZFS: vdev 37c09674044d7835 found: xbd2 (bdev 142:2)
> ZFS WARNING: vdev guid mismatch for /dev/xbd4, actual 5f400b8b20678472
> expected 3cc48031384426ef
> ZFS: trying to find a vdev (/dev/xbd4) by guid 3cc48031384426ef
> ZFS: vdev 3cc48031384426ef found: xbd3 (bdev 142:3)
> ZFS: trying to find a vdev (/dev/xbd5) by guid 5f400b8b20678472
> ZFS: vdev 5f400b8b20678472 found: xbd4 (bdev 142:4)
> ZFS WARNING: vdev guid mismatch for /dev/xbd2, actual 37c09674044d7835
> expected 574a309a2445c01c
> ZFS: trying to find a vdev (/dev/xbd2) by guid 574a309a2445c01c
> ZFS: vdev 574a309a2445c01c found: xbd1 (bdev 142:1)
> ZFS WARNING: vdev guid mismatch for /dev/xbd3, actual 3cc48031384426ef
> expected 37c09674044d7835
> ZFS: trying to find a vdev (/dev/xbd3) by guid 37c09674044d7835
> ZFS: vdev 37c09674044d7835 found: xbd2 (bdev 142:2)
> ZFS WARNING: vdev guid mismatch for /dev/xbd4, actual 5f400b8b20678472
> expected 3cc48031384426ef
> ZFS: trying to find a vdev (/dev/xbd4) by guid 3cc48031384426ef
> ZFS: vdev 3cc48031384426ef found: xbd3 (bdev 142:3)
> ZFS: trying to find a vdev (/dev/xbd5) by guid 5f400b8b20678472
> ZFS: vdev 5f400b8b20678472 found: xbd4 (bdev 142:4)
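The dmesg lines above print vdev guids in hexadecimal, while the earlier zpool status output printed them in decimal. A small sketch to check that they are the same numbers, using only values quoted in this thread (the helper names are made up for illustration):

```python
def guid_hex_to_decimal(hex_guid: str) -> int:
    """Parse a guid printed in hex (dmesg style) into an integer."""
    return int(hex_guid, 16)


def guid_decimal_to_hex(guid: int) -> str:
    """Format a guid as 16 lowercase hex digits, matching the dmesg style."""
    return format(guid, "016x")


# (hex guid from dmesg, decimal guid from zpool status), both from this thread.
PAIRS = [
    ("574a309a2445c01c", 6289893268167966748),  # was /dev/xbd2
    ("37c09674044d7835", 4017376292647041077),  # was /dev/xbd3
]

for hex_guid, dec_guid in PAIRS:
    assert guid_hex_to_decimal(hex_guid) == dec_guid
    assert guid_decimal_to_hex(dec_guid) == hex_guid
    print(hex_guid, "==", dec_guid)
```

So the guid that zpool status reported as "was /dev/xbd2" is the one now found on xbd1.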
>
> And the pool configures. It is slightly odd from a UI point of view that
> the old device names are still used (they are actually xbd1-xbd4 now).
>
> # zpool status
>   pool: tank
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           mirror-0  ONLINE       0     0     0
>             xbd2    ONLINE       0     0     0
>             xbd3    ONLINE       0     0     0
>           mirror-1  ONLINE       0     0     0
>             xbd4    ONLINE       0     0     0
>             xbd5    ONLINE       0     0     0
>
> errors: No known data errors
>
> Thanks!
thank you for testing.
i agree on the ui oddness.
i guess it's the same on other OSes with non-path vdev lookup methods, but
i'm not sure as i haven't used zfs on other OSes.
anyway, i'm not going to commit this patch anytime soon as simon might
come up with something better.
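
for readers following along, here is a rough user-space model of the behaviour seen in the dmesg output above. it is only a sketch, not the kernel code: find_vdev, driver_of, DEVICES and POOL_CONFIG are names made up for this illustration, and the eligible-driver list is passed in as a parameter because the real list in device_is_eligible_for_vdev is not reproduced here. the guids are the ones from the logs.

```python
from typing import Dict, Iterable, Optional


def driver_of(devname: str) -> str:
    """Strip the trailing unit number: 'xbd2' -> 'xbd'."""
    return devname.rstrip("0123456789")


def find_vdev(recorded_name: str, expected_guid: int,
              devices: Dict[str, int],
              eligible_drivers: Iterable[str]) -> Optional[str]:
    """Return the device currently holding expected_guid, or None."""
    # First try the name recorded in the pool config.
    if devices.get(recorded_name) == expected_guid:
        return recorded_name
    # Missing device or guid mismatch: scan devices of eligible drivers.
    eligible = set(eligible_drivers)
    for name in sorted(devices):
        if driver_of(name) in eligible and devices[name] == expected_guid:
            return name
    return None


# Devices present after xbd1 was removed, keyed by their new names.
DEVICES = {
    "xbd1": 0x574a309a2445c01c,
    "xbd2": 0x37c09674044d7835,
    "xbd3": 0x3cc48031384426ef,
    "xbd4": 0x5f400b8b20678472,
}

# Names recorded in the pool config and the guids expected there.
POOL_CONFIG = {
    "xbd2": 0x574a309a2445c01c,
    "xbd3": 0x37c09674044d7835,
    "xbd4": 0x3cc48031384426ef,
    "xbd5": 0x5f400b8b20678472,
}

for recorded, guid in POOL_CONFIG.items():
    # Recorded name on the left, device actually found on the right.
    print(recorded, "->", find_vdev(recorded, guid, DEVICES, ["xbd"]))
```

if "xbd" is left out of the eligible list, every lookup in this model returns None, which corresponds to the first test where the pool stayed UNAVAIL. the model also keeps the recorded name separate from the device that was found, which is the same split behind the zpool status oddness mentioned above.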
>
> --
> Stephen