Source-Changes-HG archive

[src/netbsd-9]: src Pull up following revision(s) (requested by riastradh in ...



details:   https://anonhg.NetBSD.org/src/rev/0a9411a78122
branches:  netbsd-9
changeset: 937393:0a9411a78122
user:      martin <martin%NetBSD.org@localhost>
date:      Mon Aug 17 10:30:22 2020 +0000

description:
Pull up following revision(s) (requested by riastradh in ticket #1050):

        sys/ufs/lfs/lfs_subr.c: revision 1.101
        sys/ufs/lfs/lfs_subr.c: revision 1.102
        sys/ufs/lfs/lfs_inode.c: revision 1.158
        sys/ufs/lfs/lfs_inode.h: revision 1.25
        sys/ufs/lfs/lfs_balloc.c: revision 1.95
        sys/ufs/lfs/lfs_pages.c: revision 1.21
        sys/ufs/lfs/lfs_vnops.c: revision 1.330
        sys/ufs/lfs/lfs_alloc.c: revision 1.140 (patch)
        sys/ufs/lfs/lfs_alloc.c: revision 1.141 (patch)
        lib/libp2k/p2k.c: revision 1.72
        sys/ufs/lfs/lfs.h: revision 1.205
        sys/ufs/lfs/lfs.h: revision 1.206
        sys/ufs/lfs/lfs_segment.c: revision 1.284
        sys/ufs/lfs/lfs.h: revision 1.207
        sys/ufs/lfs/lfs_segment.c: revision 1.285
        sys/ufs/lfs/lfs_debug.c: revision 1.55
        sys/ufs/lfs/lfs_rename.c: revision 1.23
        usr.sbin/dumplfs/dumplfs.c: revision 1.65
        sys/ufs/lfs/lfs_vfsops.c: revision 1.371
        sys/arch/i386/stand/efiboot/bootx64/Makefile: revision 1.3
        sys/ufs/lfs/lfs_vfsops.c: revision 1.372
        sys/ufs/lfs/lfs_vfsops.c: revision 1.373
        sbin/fsck_lfs/pass1.c: revision 1.46
        sys/ufs/lfs/lfs_vnops.c: revision 1.326
        sys/ufs/lfs/lfs_vnops.c: revision 1.327
        sys/ufs/lfs/lfs_vfsops.c: revision 1.375 (patch)
        sys/ufs/lfs/lfs_vnops.c: revision 1.328
        sys/ufs/lfs/lfs_subr.c: revision 1.98
        sys/ufs/lfs/lfs_extern.h: revision 1.116
        sys/ufs/lfs/lfs_vnops.c: revision 1.329
        sys/ufs/lfs/lfs_subr.c: revision 1.99
        sys/ufs/lfs/lfs_extern.h: revision 1.117
        sys/ufs/lfs/lfs_accessors.h: revision 1.49
        sys/ufs/lfs/lfs_extern.h: revision 1.118
        sys/rump/fs/lib/liblfs/Makefile: revision 1.15
        sys/ufs/lfs/lfs_bio.c: revision 1.146 (patch)
        sys/ufs/lfs/lfs_bio.c: revision 1.147
        sys/ufs/lfs/lfs_subr.c: revision 1.100

Fix kassert in lfs by initializing vp first.

Use a marker node to iterate lfs_dchainhd / i_lfs_dchain.

I believe elements can be removed while the lock is dropped,
including the next node we're hanging on to.
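A minimal userland sketch of the marker-node idea, not the lfs code
itself.  It assumes a <sys/queue.h> TAILQ; struct node, visit_all and
remove_tail are hypothetical names.  The callback stands in for work
done while the list lock is dropped, so it may remove any node,
including the one a naive loop would have saved as "next".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct node {
	TAILQ_ENTRY(node) link;
	bool is_marker;		/* markers carry no payload and are skipped */
	int value;
};
TAILQ_HEAD(nodelist, node);

/*
 * Visit every non-marker node.  Our position is held by a marker that
 * stays in the list, never by a pointer to a node that may go away.
 */
static int
visit_all(struct nodelist *head,
    void (*visit)(struct nodelist *, struct node *))
{
	struct node marker = { .is_marker = true };
	struct node *n;
	int visited = 0;

	assert(visit != NULL);
	TAILQ_INSERT_HEAD(head, &marker, link);
	while ((n = TAILQ_NEXT(&marker, link)) != NULL) {
		/* Step the marker past n before the "lock" is dropped. */
		TAILQ_REMOVE(head, &marker, link);
		TAILQ_INSERT_AFTER(head, n, &marker, link);
		if (n->is_marker)
			continue;	/* some other iterator's marker */
		visit(head, n);		/* list may change in here */
		visited++;
	}
	TAILQ_REMOVE(head, &marker, link);
	return visited;
}

/* Example callback: removes the list tail unless it is n or a marker. */
static void
remove_tail(struct nodelist *head, struct node *n)
{
	struct node *last = TAILQ_LAST(head, nodelist);

	if (last != NULL && last != n && !last->is_marker)
		TAILQ_REMOVE(head, last, link);
}
```

In a kernel setting the lock would be released and retaken around the
call to visit(); the successor is always re-read from the marker after
that point.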

Just use VOP_BWRITE for lfs_bwrite_log.
Hope this doesn't cause trouble with vfs_suspend.

Teach lfs to transition ro<->rw.

Prevent new dirops while we issue lfs_flush_dirops.

lfs_flush_dirops assumes (by KASSERT((ip->i_state & IN_ADIROP) == 0))
that vnodes on the dchain will not become involved in active dirops
even while holding no other locks (lfs_lock, v_interlock), so we must
set lfs_writer here.  All other callers already set lfs_writer.

We set fs->lfs_writer++ without explicitly doing lfs_writer_enter
because
(a) we already waited for the dirops to drain, and
(b) we hold lfs_lock and cannot drop it before setting lfs_writer.

Assert lfs_writer where I think we can now prove it.

Serialize access to the splay tree with lfs_lock.

Change some cheap KDASSERT into KASSERT.

Take a reference and fix assertions in lfs_flush_dirops.
Fixes panic:
KASSERT((ip->i_state & IN_ADIROP) == 0) at lfs_vnops.c:1670
lfs_flush_dirops
lfs_check
lfs_setattr
VOP_SETATTR
change_mode
sys_fchmod
syscall

This assertion -- and the assertion that vp->v_uflag has VU_DIROP set
-- is valid only until we release lfs_lock, because we may race with
lfs_unmark_dirop which will remove the nodes and change the flags.

Further, vp itself is valid only as long as it is referenced, which it
is as long as it's on the dchain, but lfs_unmark_dirop drops the
dchain's reference.

Don't lfs_writer_enter while holding v_interlock.

There's no need to lfs_writer_enter at all here, as far as I can see.
lfs_flush_fs will do it for us.

Break deadlock in PR kern/52301.

The lock order is lfs_writer -> lfs_seglock.  The problem in 52301 is
that lfs_segwrite violates this lock order by sometimes doing
lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b),
opportunistically, when there are no dirops pending.  Both cases can
deadlock, because dirops sometimes take the seglock (lfs_truncate,
lfs_valloc, lfs_vfree):
(a) There may be dirops pending, and they may be waiting for the
seglock, so we can't wait for them to complete while holding the
seglock.
(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state
may change by the time lfs_writer_enter acquires lfs_lock.

To resolve this in each case:
(a) Do lfs_writer_enter before lfs_seglock, since we will need it
unconditionally anyway.  The worst performance impact of this should
be that some dirops get delayed a little bit.
(b) Create a new lfs_writer_tryenter to use at this point so that the
test for fs->lfs_dirops == 0 and the acquisition of lfs_writer happen
atomically under lfs_lock.
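Resolution (b) can be sketched in userland as follows.  This is an
illustration under assumptions, not the committed code: a pthread mutex
stands in for lfs_lock, and struct fs_state, writer_tryenter and
writer_leave are hypothetical names.  The point is only that the test
and the increment happen inside one critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct fs_state {
	pthread_mutex_t lock;	/* stands in for lfs_lock */
	unsigned dirops;	/* directory operations in flight */
	unsigned writer;	/* writers excluding new dirops */
};

/*
 * Become a writer only if no dirops are pending.  Never sleeps, so it
 * is safe to call with other locks held.
 */
static bool
writer_tryenter(struct fs_state *fs)
{
	bool entered;

	pthread_mutex_lock(&fs->lock);
	entered = (fs->dirops == 0);
	if (entered)
		fs->writer++;
	pthread_mutex_unlock(&fs->lock);
	return entered;
}

static void
writer_leave(struct fs_state *fs)
{
	pthread_mutex_lock(&fs->lock);
	assert(fs->writer > 0);
	fs->writer--;
	pthread_mutex_unlock(&fs->lock);
}
```

A caller that gets false back simply skips the opportunistic work
instead of waiting for dirops to drain while holding the seglock.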

Initialize/destroy lfs_allclean_wakeup in modcmd, not lfs_mountfs.

Fixes reloading lfs.kmod.

In lfs_update, hold lfs_writer around lfs_vflush.

Otherwise, we might do
lfs_vflush
-> lfs_seglock
-> lfs_segwait(SEGM_CKP)
   -> lfs_writer_enter
which is the reverse of the lfs_writer -> lfs_seglock ordering.

Call lfs_orphan in lfs_rename while we're still in the dirop.

lfs_writer_enter can't fail; keep it simple and don't pretend it can.

Assert that mtsleep can't fail either -- it doesn't catch signals and
there's no timeout.

Teach LFS_ORPHAN_NEXTFREE about lfs64.

Dust off the orphan detection code and try to make it work.

Fix !DIAGNOSTIC compile

Fix userland references to LFS_ORPHAN_NEXTFREE.

Forgot to grep for these or do a full distribution build, oops!

Fix missing <sys/evcnt.h> by removing the evcnts instead.

Just wanted to confirm that a race might happen, and indeed it did.
These serve little diagnostic value otherwise.

OR into bp->b_cflags; don't overwrite.

CTASSERT lfs on-disk structure sizes.

Avoid misaligned access to lfs64 on-disk records in memory.

lfs64 directory entries are only 32-bit aligned in order to conserve
space in directory blocks, and we had a hack to stuff a 64-bit inode
in them.  This replaces the hack by __aligned(4) __packed, and goes
further:

1. It's not clear that all the other lfs64 data structures are 64-bit
   aligned on disk to begin with.  We can go through these later and
   upgrade them from
        struct foo64 {
                ...
        } __aligned(4) __packed;
        union foo {
                struct foo64 f64;
                ...
        };
   to
        struct foo64 {
                ...
        };
        union foo {
                struct foo64 f64 __aligned(8);
                ...
        } __aligned(4) __packed;
   if we really want to take advantage of 64-bit memory accesses.
   However, the __aligned(4) __packed must remain on the union
   because:
2. We access even the lfs32 data structures via a union that has
   lfs64 members, and it turns out that compilers will assume access
   through a union with 64-bit aligned members implies the whole
   union has 64-bit alignment, even if we're only accessing a 32-bit
   aligned member.
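A small standalone sketch of the effect, assuming GCC or clang
attribute syntax (the kernel spells these __aligned(4) __packed).
struct rec64, struct rec32, union rec and rec_ino are hypothetical
stand-ins for the directory header types, with static assertions in
the spirit of the CTASSERTs added by this change.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit record that is only 32-bit aligned on disk. */
struct rec64 {
	uint64_t ino;
	uint16_t reclen;
	uint8_t  type;
	uint8_t  namlen;
} __attribute__((aligned(4), packed));

struct rec32 {
	uint32_t ino;
	uint16_t reclen;
	uint8_t  type;
	uint8_t  namlen;
};

union rec {
	struct rec64 u_64;
	struct rec32 u_32;
};

/* No tail padding to 16 bytes, and the union never claims 8-byte alignment. */
static_assert(sizeof(struct rec64) == 12, "rec64 size");
static_assert(sizeof(struct rec32) == 8, "rec32 size");
static_assert(_Alignof(union rec) == 4, "union rec alignment");

/*
 * Read the inode number through the union.  Because the union is only
 * 4-byte aligned, the compiler may not assume an 8-byte aligned load.
 */
static uint64_t
rec_ino(const union rec *r, int is64)
{
	return is64 ? r->u_64.ino : r->u_32.ino;
}
```

Without the attributes, struct rec64 would typically be 16 bytes with
8-byte alignment on LP64 targets, and the union would inherit that
alignment even for accesses to the 32-bit member.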

Fix clang build after packed lfs64 accessor change.

Suppress spurious address-of-packed error in rump lfs too.

diffstat:

 lib/libp2k/p2k.c                             |    4 +-
 sbin/fsck_lfs/pass1.c                        |    4 +-
 sys/arch/i386/stand/efiboot/bootx64/Makefile |    7 +-
 sys/rump/fs/lib/liblfs/Makefile              |    7 +-
 sys/ufs/lfs/lfs.h                            |   64 ++++++++--
 sys/ufs/lfs/lfs_accessors.h                  |   25 +---
 sys/ufs/lfs/lfs_alloc.c                      |  162 ++++++++++++++++++++------
 sys/ufs/lfs/lfs_balloc.c                     |   29 ++--
 sys/ufs/lfs/lfs_bio.c                        |   11 +-
 sys/ufs/lfs/lfs_debug.c                      |   10 +-
 sys/ufs/lfs/lfs_extern.h                     |    8 +-
 sys/ufs/lfs/lfs_inode.c                      |   19 ++-
 sys/ufs/lfs/lfs_inode.h                      |    3 +-
 sys/ufs/lfs/lfs_pages.c                      |   27 ++--
 sys/ufs/lfs/lfs_rename.c                     |    7 +-
 sys/ufs/lfs/lfs_segment.c                    |   24 ++-
 sys/ufs/lfs/lfs_subr.c                       |   58 +++++++--
 sys/ufs/lfs/lfs_vfsops.c                     |  163 ++++++++++++++------------
 sys/ufs/lfs/lfs_vnops.c                      |   79 +++++++++---
 usr.sbin/dumplfs/dumplfs.c                   |    6 +-
 20 files changed, 468 insertions(+), 249 deletions(-)

diffs (truncated from 1638 to 300 lines):

diff -r f395940f5de6 -r 0a9411a78122 lib/libp2k/p2k.c
--- a/lib/libp2k/p2k.c  Fri Aug 14 11:05:16 2020 +0000
+++ b/lib/libp2k/p2k.c  Mon Aug 17 10:30:22 2020 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: p2k.c,v 1.70 2017/04/26 03:02:48 riastradh Exp $       */
+/*     $NetBSD: p2k.c,v 1.70.14.1 2020/08/17 10:30:22 martin Exp $     */
 
 /*
  * Copyright (c) 2007, 2008, 2009  Antti Kantee.  All Rights Reserved.
@@ -789,7 +789,7 @@
        struct p2k_node *p2n;
        struct componentname *cn;
        struct vattr *va_x;
-       struct vnode *vp;
+       struct vnode *vp = NULL;
        int rv;
 
        p2n = malloc(sizeof(*p2n));
diff -r f395940f5de6 -r 0a9411a78122 sbin/fsck_lfs/pass1.c
--- a/sbin/fsck_lfs/pass1.c     Fri Aug 14 11:05:16 2020 +0000
+++ b/sbin/fsck_lfs/pass1.c     Mon Aug 17 10:30:22 2020 +0000
@@ -1,4 +1,4 @@
-/* $NetBSD: pass1.c,v 1.45 2015/10/03 08:30:13 dholland Exp $   */
+/* $NetBSD: pass1.c,v 1.45.18.1 2020/08/17 10:30:22 martin Exp $        */
 
 /*
  * Copyright (c) 1980, 1986, 1993
@@ -307,7 +307,7 @@
         */
        if (lfs_dino_getnlink(fs, dp) <= 0) {
                LFS_IENTRY(ifp, fs, inumber, bp);
-               if (lfs_if_getnextfree(fs, ifp) == LFS_ORPHAN_NEXTFREE) {
+               if (lfs_if_getnextfree(fs, ifp) == LFS_ORPHAN_NEXTFREE(fs)) {
                        statemap[inumber] = (mode == LFS_IFDIR ? DCLEAR : FCLEAR);
                        /* Add this to our list of orphans */
                        zlnp = emalloc(sizeof *zlnp);
diff -r f395940f5de6 -r 0a9411a78122 sys/arch/i386/stand/efiboot/bootx64/Makefile
--- a/sys/arch/i386/stand/efiboot/bootx64/Makefile      Fri Aug 14 11:05:16 2020 +0000
+++ b/sys/arch/i386/stand/efiboot/bootx64/Makefile      Mon Aug 17 10:30:22 2020 +0000
@@ -1,4 +1,4 @@
-#      $NetBSD: Makefile,v 1.1.26.1 2019/09/17 19:32:00 martin Exp $
+#      $NetBSD: Makefile,v 1.1.26.2 2020/08/17 10:30:22 martin Exp $
 
 PROG=          bootx64.efi
 OBJFMT=                pei-x86-64
@@ -9,4 +9,9 @@
 COPTS+=                -mno-red-zone
 CPPFLAGS+=     -DEFI_FUNCTION_WRAPPER
 
+# Follow the suit of Makefile.kern.inc; needed for the lfs64 union
+# accessors -- they don't actually dereference the resulting pointer,
+# just use it for type-checking.
+CWARNFLAGS.clang+=     -Wno-error=address-of-packed-member
+
 .include "${.CURDIR}/../Makefile.efiboot"
diff -r f395940f5de6 -r 0a9411a78122 sys/rump/fs/lib/liblfs/Makefile
--- a/sys/rump/fs/lib/liblfs/Makefile   Fri Aug 14 11:05:16 2020 +0000
+++ b/sys/rump/fs/lib/liblfs/Makefile   Mon Aug 17 10:30:22 2020 +0000
@@ -1,4 +1,4 @@
-#      $NetBSD: Makefile,v 1.14 2016/03/23 21:38:51 christos Exp $
+#      $NetBSD: Makefile,v 1.14.22.1 2020/08/17 10:30:22 martin Exp $
 #
 
 .PATH:  ${.CURDIR}/../../../../ufs/lfs
@@ -21,5 +21,10 @@
 COPTS.lfs_inode.c+=-O0
 .endif
 
+# Follow the suit of Makefile.kern.inc; needed for the lfs64 union
+# accessors -- they don't actually dereference the resulting pointer,
+# just use it for type-checking.
+CWARNFLAGS.clang+=     -Wno-error=address-of-packed-member
+
 .include <bsd.lib.mk>
 .include <bsd.klinks.mk>
diff -r f395940f5de6 -r 0a9411a78122 sys/ufs/lfs/lfs.h
--- a/sys/ufs/lfs/lfs.h Fri Aug 14 11:05:16 2020 +0000
+++ b/sys/ufs/lfs/lfs.h Mon Aug 17 10:30:22 2020 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: lfs.h,v 1.204 2019/01/10 06:31:04 martin Exp $ */
+/*     $NetBSD: lfs.h,v 1.204.4.1 2020/08/17 10:30:22 martin Exp $     */
 
 /*  from NetBSD: dinode.h,v 1.25 2016/01/22 23:06:10 dholland Exp  */
 /*  from NetBSD: dir.h,v 1.25 2015/09/01 06:16:03 dholland Exp  */
@@ -355,19 +355,22 @@
        uint8_t  dh_type;               /* file type, see below */
        uint8_t  dh_namlen;             /* length of string in d_name */
 };
+__CTASSERT(sizeof(struct lfs_dirheader32) == 8);
 
 struct lfs_dirheader64 {
-       uint32_t dh_inoA;               /* inode number of entry */
-       uint32_t dh_inoB;               /* inode number of entry */
+       uint64_t dh_ino;                /* inode number of entry */
        uint16_t dh_reclen;             /* length of this record */
        uint8_t  dh_type;               /* file type, see below */
        uint8_t  dh_namlen;             /* length of string in d_name */
-};
+} __aligned(4) __packed;
+__CTASSERT(sizeof(struct lfs_dirheader64) == 12);
 
 union lfs_dirheader {
        struct lfs_dirheader64 u_64;
        struct lfs_dirheader32 u_32;
 };
+__CTASSERT(__alignof(union lfs_dirheader) == __alignof(struct lfs_dirheader64));
+__CTASSERT(__alignof(union lfs_dirheader) == __alignof(struct lfs_dirheader32));
 
 typedef union lfs_dirheader LFS_DIRHEADER;
 
@@ -381,6 +384,7 @@
        struct lfs_dirheader32  dotdot_header;
        char                    dotdot_name[4]; /* ditto */
 };
+__CTASSERT(sizeof(struct lfs_dirtemplate32) == 2*(8 + 4));
 
 struct lfs_dirtemplate64 {
        struct lfs_dirheader64  dot_header;
@@ -388,6 +392,7 @@
        struct lfs_dirheader64  dotdot_header;
        char                    dotdot_name[4]; /* ditto */
 };
+__CTASSERT(sizeof(struct lfs_dirtemplate64) == 2*(12 + 4));
 
 union lfs_dirtemplate {
        struct lfs_dirtemplate64 u_64;
@@ -408,6 +413,7 @@
        uint16_t        dotdot_namlen;
        char            dotdot_name[4]; /* ditto */
 };
+__CTASSERT(sizeof(struct lfs_odirtemplate) == 2*(8 + 4));
 #endif
 
 /*
@@ -441,6 +447,7 @@
        uint32_t        di_gid;         /* 116: File group. */
        uint64_t        di_modrev;      /* 120: i_modrev for NFSv4 */
 };
+__CTASSERT(sizeof(struct lfs32_dinode) == 128);
 
 struct lfs64_dinode {
        uint16_t        di_mode;        /*   0: IFMT, permissions; see below. */
@@ -469,11 +476,14 @@
        uint64_t        di_inumber;     /* 240: Inode number */
        uint64_t        di_spare[1];    /* 248: Reserved; currently unused */
 };
+__CTASSERT(sizeof(struct lfs64_dinode) == 256);
 
 union lfs_dinode {
        struct lfs64_dinode u_64;
        struct lfs32_dinode u_32;
 };
+__CTASSERT(__alignof(union lfs_dinode) == __alignof(struct lfs64_dinode));
+__CTASSERT(__alignof(union lfs_dinode) == __alignof(struct lfs32_dinode));
 
 /*
  * The di_db fields may be overlaid with other information for
@@ -529,6 +539,7 @@
        uint32_t su_flags;              /* 12: segment flags */
        uint64_t su_lastmod;            /* 16: last modified timestamp */
 };
+__CTASSERT(sizeof(struct segusage) == 24);
 
 typedef struct segusage_v1 SEGUSE_V1;
 struct segusage_v1 {
@@ -538,6 +549,7 @@
        uint16_t su_ninos;              /* 10: number of inode blocks in seg */
        uint32_t su_flags;              /* 12: segment flags  */
 };
+__CTASSERT(sizeof(struct segusage_v1) == 16);
 
 /*
  * On-disk file information.  One per file with data blocks in the segment.
@@ -554,7 +566,8 @@
        uint64_t fi_ino;                /* inode number */
        uint32_t fi_lastlength;         /* length of last block in array */
        uint32_t fi_pad;                /* unused */
-};
+} __aligned(4) __packed;
+__CTASSERT(sizeof(struct finfo64) == 24);
 
 typedef struct finfo32 FINFO32;
 struct finfo32 {
@@ -563,11 +576,14 @@
        uint32_t fi_ino;                /* inode number */
        uint32_t fi_lastlength;         /* length of last block in array */
 };
+__CTASSERT(sizeof(struct finfo32) == 16);
 
 typedef union finfo {
        struct finfo64 u_64;
        struct finfo32 u_32;
 } FINFO;
+__CTASSERT(__alignof(union finfo) == __alignof(struct finfo64));
+__CTASSERT(__alignof(union finfo) == __alignof(struct finfo32));
 
 /*
  * inode info (part of the segment summary)
@@ -579,16 +595,20 @@
 
 typedef struct iinfo64 {
        uint64_t ii_block;              /* block number */
-} IINFO64;
+} __aligned(4) __packed IINFO64;
+__CTASSERT(sizeof(struct iinfo64) == 8);
 
 typedef struct iinfo32 {
        uint32_t ii_block;              /* block number */
 } IINFO32;
+__CTASSERT(sizeof(struct iinfo32) == 4);
 
 typedef union iinfo {
        struct iinfo64 u_64;
        struct iinfo32 u_32;
 } IINFO;
+__CTASSERT(__alignof(union iinfo) == __alignof(struct iinfo64));
+__CTASSERT(__alignof(union iinfo) == __alignof(struct iinfo32));
 
 /*
  * Index file inode entries.
@@ -596,8 +616,9 @@
 
 /* magic value for daddrs */
 #define        LFS_UNUSED_DADDR        0       /* out-of-band daddr */
-/* magic value for if_nextfree */
-#define LFS_ORPHAN_NEXTFREE    (~(uint32_t)0) /* indicate orphaned file */
+/* magic value for if_nextfree -- indicate orphaned file */
+#define LFS_ORPHAN_NEXTFREE(fs) \
+       ((fs)->lfs_is64 ? ~(uint64_t)0 : ~(uint32_t)0)
 
 typedef struct ifile64 IFILE64;
 struct ifile64 {
@@ -606,7 +627,8 @@
        uint64_t if_atime_sec;          /* Last access time, seconds */
        int64_t   if_daddr;             /* inode disk address */
        uint64_t if_nextfree;           /* next-unallocated inode */
-};
+} __aligned(4) __packed;
+__CTASSERT(sizeof(struct ifile64) == 32);
 
 typedef struct ifile32 IFILE32;
 struct ifile32 {
@@ -616,6 +638,7 @@
        uint32_t if_atime_sec;          /* Last access time, seconds */
        uint32_t if_atime_nsec;         /* and nanoseconds */
 };
+__CTASSERT(sizeof(struct ifile32) == 20);
 
 typedef struct ifile_v1 IFILE_V1;
 struct ifile_v1 {
@@ -627,6 +650,7 @@
        struct timespec if_atime;       /* Last access time */
 #endif
 };
+__CTASSERT(sizeof(struct ifile_v1) == 12);
 
 /*
  * Note: struct ifile_v1 is often handled by accessing the first three
@@ -638,6 +662,9 @@
        struct ifile32 u_32;
        struct ifile_v1 u_v1;
 } IFILE;
+__CTASSERT(__alignof(union ifile) == __alignof(struct ifile64));
+__CTASSERT(__alignof(union ifile) == __alignof(struct ifile32));
+__CTASSERT(__alignof(union ifile) == __alignof(struct ifile_v1));
 
 /*
  * Cleaner information structure.  This resides in the ifile and is used
@@ -656,6 +683,7 @@
        uint32_t free_tail;             /* 20: tail of the inode free list */
        uint32_t flags;                 /* 24: status word from the kernel */
 } CLEANERINFO32;
+__CTASSERT(sizeof(struct _cleanerinfo32) == 28);
 
 typedef struct _cleanerinfo64 {
        uint32_t clean;                 /* 0: number of clean segments */
@@ -666,13 +694,16 @@
        uint64_t free_tail;             /* 32: tail of the inode free list */
        uint32_t flags;                 /* 40: status word from the kernel */
        uint32_t pad;                   /* 44: must be 64-bit aligned */
-} CLEANERINFO64;
+} __aligned(4) __packed CLEANERINFO64;
+__CTASSERT(sizeof(struct _cleanerinfo64) == 48);
 
 /* this must not go to disk directly of course */
 typedef union _cleanerinfo {
        CLEANERINFO32 u_32;
        CLEANERINFO64 u_64;
 } CLEANERINFO;
+__CTASSERT(__alignof(union _cleanerinfo) == __alignof(struct _cleanerinfo32));
+__CTASSERT(__alignof(union _cleanerinfo) == __alignof(struct _cleanerinfo64));
 
 /*
  * On-disk segment summary information
@@ -704,6 +735,7 @@
        uint16_t ss_pad;                /* 26: extra space */
        /* FINFO's and inode daddr's... */
 };
+__CTASSERT(sizeof(struct segsum_v1) == 28);
 


