Source-Changes-HG archive


[src/netbsd-6]: src Pull up following revision(s) (requested by manu in ticke...



details:   https://anonhg.NetBSD.org/src/rev/f472b791012c
branches:  netbsd-6
changeset: 774410:f472b791012c
user:      martin <martin%NetBSD.org@localhost>
date:      Sun Aug 12 13:13:20 2012 +0000

description:
Pull up following revision(s) (requested by manu in ticket #438):
        lib/libperfuse/perfuse_priv.h: revision 1.31
        sys/fs/puffs/puffs_msgif.h: revision 1.80
        sys/fs/puffs/puffs_vnops.c: revision 1.171
        lib/libpuffs/puffs_ops.3: revision 1.31
        sys/fs/puffs/puffs_vnops.c: revision 1.172
        sys/fs/puffs/puffs_vnops.c: revision 1.173
        sys/fs/puffs/puffs_vnops.c: revision 1.174
        usr.sbin/perfused/perfused.c: revision 1.24
        sys/fs/puffs/puffs_sys.h: revision 1.80
        sys/fs/puffs/puffs_sys.h: revision 1.81
        sys/fs/puffs/puffs_sys.h: revision 1.82
        lib/libperfuse/subr.c: revision 1.19
        lib/libperfuse/perfuse.c: revision 1.30
        sys/fs/puffs/puffs_msgif.c: revision 1.90
        sys/fs/puffs/puffs_msgif.c: revision 1.91
        sys/fs/puffs/puffs_msgif.c: revision 1.92
        lib/libperfuse/ops.c: revision 1.59
        lib/libpuffs/puffs.3: revision 1.53
        lib/libperfuse/debug.c: revision 1.12
        lib/libpuffs/puffs.3: revision 1.54
        sys/fs/puffs/puffs_vnops.c: revision 1.167
        sys/fs/puffs/puffs_msgif.h: revision 1.79
        usr.sbin/perfused/msg.c: revision 1.21
        sys/fs/puffs/puffs_vfsops.c: revision 1.102
        sys/fs/puffs/puffs_vfsops.c: revision 1.103
        sys/fs/puffs/puffs_vfsops.c: revision 1.105
        lib/libpuffs/puffs.h: revision 1.123
        lib/libperfuse/perfuse_if.h: revision 1.20
        lib/libperfuse/perfuse.c: revision 1.29
        lib/libpuffs/dispatcher.c: revision 1.42
        lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed.
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing a node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is,
however, an exception for setattr operations that touch the size: we need
to wait for completion because we have other operations queued for after
the resize.
- Fix fchmod() on write-open file
fchmod() on a node opened with write privilege will send setattr with both
mode and size set. This confuses some FUSE filesystems. Therefore we send two
FUSE operations, one for mode and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ended
up with a vnode associated with a cookie that has been reclaimed in
userland. The next operation on the cookie would crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
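
The lookup-count check described in the last item can be pictured with the
sketch below. It is only an illustration, not the code from this changeset:
struct demo_node, demo_lookup_reply() and demo_reclaim() are hypothetical
names, and subtracting the kernel's count from the userland count is an
assumed way of detecting a lookup that is still in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-node state kept by the userland file server. */
struct demo_node {
	unsigned int nlookup;	/* lookup replies sent to the kernel */
	bool reclaimed;		/* true once the node has been freed */
};

/* Userland answers a lookup for this node: count the reply. */
static void
demo_lookup_reply(struct demo_node *n)
{
	n->nlookup++;
}

/*
 * The kernel reclaims the vnode and reports how many lookups it has
 * seen complete. Returns true if the node was really reclaimed, false
 * if the reclaim was ignored because a lookup is still in flight.
 */
static bool
demo_reclaim(struct demo_node *n, unsigned int kernel_nlookup)
{
	/* The kernel cannot have completed more lookups than we answered. */
	assert(kernel_nlookup <= n->nlookup);

	n->nlookup -= kernel_nlookup;
	if (n->nlookup > 0)
		return false;	/* a reply is in flight: keep the node */

	n->reclaimed = true;
	return true;
}
```

With two replies sent and only one seen by the kernel, the first reclaim is
ignored; a later reclaim that accounts for the remaining lookup frees the node.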
Fix unmount hang introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had a timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
the fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
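
The difference between the two queues described in the preceding paragraphs
can be sketched as follows. This is a simplified illustration under stated
assumptions, not the kernel code: enum demo_queue, struct demo_req,
demo_req_ready() and demo_count_ready() are hypothetical names, and time is
modelled with a plain time_t instead of kernel ticks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical queue kinds; the names are illustrative only. */
enum demo_queue { DEMO_FAST, DEMO_NODE };

struct demo_req {
	enum demo_queue q;
	time_t due;		/* only meaningful for DEMO_NODE */
};

/* Decide whether a queued request may run at time "now". */
static bool
demo_req_ready(const struct demo_req *r, time_t now)
{
	if (r->q == DEMO_FAST)
		return true;	/* ASAP: never consult the timestamp */
	return r->due <= now;	/* postponed reclaim: wait until due */
}

/* Count the requests in reqs[0..n) that are ready at time "now". */
static size_t
demo_count_ready(const struct demo_req *reqs, size_t n, time_t now)
{
	size_t i, ready = 0;

	for (i = 0; i < n; i++)
		if (demo_req_ready(&reqs[i], now))
			ready++;
	return ready;
}
```

Requests on the fast queue are always ready, so they can never be left
waiting for a tick that nobody signals; only postponed node reclaims are
compared against their due time.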
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
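
The idea behind PUFFS_KFLAG_CACHE_DOTDOT can be illustrated with the sketch
below. It is a toy model under stated assumptions, not the puffs
implementation (which is not visible in the truncated diff): struct
demo_vnode, demo_attach(), demo_detach() and demo_lookup_dotdot() are
hypothetical names, and the request counter merely stands in for a message
sent to the file server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vnode: a child keeps a counted reference on its parent. */
struct demo_vnode {
	struct demo_vnode *parent;	/* referenced parent, NULL if unknown */
	unsigned int refcnt;
};

/* Record the parent and take a reference so that it stays active. */
static void
demo_attach(struct demo_vnode *child, struct demo_vnode *parent)
{
	child->parent = parent;
	parent->refcnt++;
}

/* Drop the reference on the parent, e.g. when the child is reclaimed. */
static void
demo_detach(struct demo_vnode *child)
{
	assert(child->parent != NULL && child->parent->refcnt > 0);
	child->parent->refcnt--;
	child->parent = NULL;
}

/*
 * Look up "..": answer from the cached parent when there is one,
 * otherwise count a request that would go to the file server.
 */
static struct demo_vnode *
demo_lookup_dotdot(struct demo_vnode *dvp, unsigned int *nrequests)
{
	if (dvp->parent != NULL)
		return dvp->parent;	/* no round trip needed */
	(*nrequests)++;			/* stand-in for a real lookup */
	return NULL;
}
```

As long as the child holds its reference, the parent cannot be reclaimed and
".." is resolved locally; once the reference is dropped, the lookup falls
back to asking the file server.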

diffstat:

 lib/libperfuse/debug.c        |    14 +-
 lib/libperfuse/ops.c          |  1154 +++++++++++++++++++++-------------------
 lib/libperfuse/perfuse.c      |    72 ++-
 lib/libperfuse/perfuse_if.h   |     9 +-
 lib/libperfuse/perfuse_priv.h |    46 +-
 lib/libperfuse/subr.c         |   134 +++-
 lib/libpuffs/dispatcher.c     |    60 +-
 lib/libpuffs/puffs.3          |    18 +-
 lib/libpuffs/puffs.h          |     3 +-
 lib/libpuffs/puffs_ops.3      |    36 +-
 sys/fs/puffs/puffs_msgif.c    |    81 ++-
 sys/fs/puffs/puffs_msgif.h    |    25 +-
 sys/fs/puffs/puffs_sys.h      |    26 +-
 sys/fs/puffs/puffs_vfsops.c   |    10 +-
 sys/fs/puffs/puffs_vnops.c    |   232 ++++++-
 usr.sbin/perfused/msg.c       |     9 +-
 usr.sbin/perfused/perfused.c  |     9 +-
 17 files changed, 1205 insertions(+), 733 deletions(-)

diffs (truncated from 3911 to 300 lines):

diff -r 63de0174dbd3 -r f472b791012c lib/libperfuse/debug.c
--- a/lib/libperfuse/debug.c    Sun Aug 12 13:02:59 2012 +0000
+++ b/lib/libperfuse/debug.c    Sun Aug 12 13:13:20 2012 +0000
@@ -1,4 +1,4 @@
-/*  $NetBSD: debug.c,v 1.10.2.1 2012/04/23 16:48:59 riz Exp $ */
+/*  $NetBSD: debug.c,v 1.10.2.2 2012/08/12 13:13:20 martin Exp $ */
 
 /*-
  *  Copyright (c) 2010 Emmanuel Dreyfus. All rights reserved.
@@ -90,7 +90,8 @@
        "WRITE",
        "AFTERWRITE",
        "OPEN",
-       "AFTERXCHG"
+       "AFTERXCHG",
+       "REF"
 };
 
 const char *
@@ -146,7 +147,7 @@
                (void)strcpy(pt->pt_path, "");
        else
                (void)strlcpy(pt->pt_path, 
-                             perfuse_node_path(opc),
+                             perfuse_node_path(ps, opc),
                              sizeof(pt->pt_path));
 
        (void)strlcpy(pt->pt_extra,
@@ -198,7 +199,7 @@
        ps = puffs_getspecific(pu);
 
        (void)ftruncate(fileno(fp), 0);
-       (void)fseek(fp, 0, SEEK_SET);
+       (void)fseek(fp, 0UL, SEEK_SET);
 
        (void)memset(&ts_min, 0, sizeof(ts_min));
        (void)memset(&ts_max, 0, sizeof(ts_max));
@@ -265,6 +266,11 @@
                        (long)(avg % 1000000000L),
                        (long long)ts_max[i].tv_sec, ts_max[i].tv_nsec);
        }       
+
+       fprintf(fp, "\n\nGlobal statistics\n");
+       fprintf(fp, "Nodes: %d\n", ps->ps_nodecount);
+       fprintf(fp, "Exchanges: %d\n", ps->ps_xchgcount);
+       fprintf(fp, "Nodes possibly leaked: %d\n", ps->ps_nodeleakcount);
        
        (void)fflush(fp);
        return;
diff -r 63de0174dbd3 -r f472b791012c lib/libperfuse/ops.c
--- a/lib/libperfuse/ops.c      Sun Aug 12 13:02:59 2012 +0000
+++ b/lib/libperfuse/ops.c      Sun Aug 12 13:13:20 2012 +0000
@@ -1,4 +1,4 @@
-/*  $NetBSD: ops.c,v 1.50.2.5 2012/07/05 17:26:14 riz Exp $ */
+/*  $NetBSD: ops.c,v 1.50.2.6 2012/08/12 13:13:20 martin Exp $ */
 
 /*-
  *  Copyright (c) 2010-2011 Emmanuel Dreyfus. All rights reserved.
@@ -34,7 +34,7 @@
 #include <sysexits.h>
 #include <syslog.h>
 #include <puffs.h>
-#include <sys/cdefs.h>
+#include <sys/socket.h>
 #include <sys/socket.h>
 #include <sys/extattr.h>
 #include <sys/time.h>
@@ -50,21 +50,15 @@
 #endif
 #ifdef PUFFS_KFLAG_CACHE_FS_TTL
 static void perfuse_newinfo_setttl(struct puffs_newinfo *, 
-    struct fuse_entry_out *, struct fuse_attr_out *);
-#else /* PUFFS_KFLAG_CACHE_FS_TTL */
-static void set_expire(puffs_cookie_t, struct fuse_entry_out *, 
-    struct fuse_attr_out *);
-static int attr_expired(puffs_cookie_t);
-static int entry_expired(puffs_cookie_t);
+    struct puffs_node *, struct fuse_entry_out *, struct fuse_attr_out *);
 #endif /* PUFFS_KFLAG_CACHE_FS_TTL */
 static int xchg_msg(struct puffs_usermount *, puffs_cookie_t, 
     perfuse_msg_t *, size_t, enum perfuse_xchg_pb_reply); 
 static int mode_access(puffs_cookie_t, const struct puffs_cred *, mode_t);
-static int sticky_access(struct puffs_node *, const struct puffs_cred *);
+static int sticky_access(puffs_cookie_t, struct puffs_node *, 
+    const struct puffs_cred *);
 static void fuse_attr_to_vap(struct perfuse_state *,
     struct vattr *, struct fuse_attr *);
-static int node_lookup_dir_nodot(struct puffs_usermount *,
-    puffs_cookie_t, char *, size_t, struct puffs_node **);
 static int node_lookup_common(struct puffs_usermount *, puffs_cookie_t, 
     struct puffs_newinfo *, const char *, const struct puffs_cred *, 
     struct puffs_node **);
@@ -73,12 +67,13 @@
 static uint64_t readdir_last_cookie(struct fuse_dirent *, size_t); 
 static ssize_t fuse_to_dirent(struct puffs_usermount *, puffs_cookie_t,
     struct fuse_dirent *, size_t);
-static int readdir_buffered(puffs_cookie_t, struct dirent *, off_t *, 
+static void readdir_buffered(puffs_cookie_t, struct dirent *, off_t *, 
     size_t *);
+static void node_ref(puffs_cookie_t);
+static void node_rele(puffs_cookie_t);
 static void requeue_request(struct puffs_usermount *, 
     puffs_cookie_t opc, enum perfuse_qtype);
-static int dequeue_requests(struct perfuse_state *, 
-    puffs_cookie_t opc, enum perfuse_qtype, int);
+static int dequeue_requests(puffs_cookie_t opc, enum perfuse_qtype, int);
 #define DEQUEUE_ALL 0
 
 /* 
@@ -213,12 +208,13 @@
 #ifdef PERFUSE_DEBUG
        if ((perfuse_diagflags & PDF_FILENAME) && (opc != 0))
                DPRINTF("file = \"%s\", ino = %"PRIu64" flags = 0x%x\n", 
-                       perfuse_node_path(opc), 
+                       perfuse_node_path(ps, opc), 
                        ((struct puffs_node *)opc)->pn_va.va_fileid,
                        PERFUSE_NODE_DATA(opc)->pnd_flags);
 #endif
+       ps->ps_xchgcount++;
        if (pnd)
-               pnd->pnd_flags |= PND_INXCHG;
+               pnd->pnd_inxchg++;
 
        /*
         * Record FUSE call start if requested
@@ -238,9 +234,10 @@
        if (pt != NULL)
                perfuse_trace_end(ps, pt, error);
 
+       ps->ps_xchgcount--;
        if (pnd) {
-               pnd->pnd_flags &= ~PND_INXCHG;
-               (void)dequeue_requests(ps, opc, PCQ_AFTERXCHG, DEQUEUE_ALL);
+               pnd->pnd_inxchg--;
+               (void)dequeue_requests(opc, PCQ_AFTERXCHG, DEQUEUE_ALL);
        }
 
        return error;
@@ -268,14 +265,12 @@
 }
 
 static int 
-sticky_access(struct puffs_node *targ, const struct puffs_cred *pcr)
+sticky_access(puffs_cookie_t opc, struct puffs_node *targ,
+             const struct puffs_cred *pcr)
 {
        uid_t uid;
-       struct puffs_node *tdir;
        int sticky, owner;
 
-       tdir = PERFUSE_NODE_DATA(targ)->pnd_parent;
-
        /*
         * This covers the case where the kernel requests a DELETE
         * or RENAME on its own, and where puffs_cred_getuid would 
@@ -291,7 +286,7 @@
        if (puffs_cred_getuid(pcr, &uid) != 0)
                DERRX(EX_SOFTWARE, "puffs_cred_getuid fails in %s", __func__);
 
-       sticky = puffs_pn_getvap(tdir)->va_mode & S_ISTXT;
+       sticky = puffs_pn_getvap(opc)->va_mode & S_ISTXT;
        owner = puffs_pn_getvap(targ)->va_uid == uid;
 
        if (sticky && !owner)
@@ -339,8 +334,10 @@
 }
 
 #ifdef PUFFS_KFLAG_CACHE_FS_TTL
-static void perfuse_newinfo_setttl(struct puffs_newinfo *pni, 
-       struct fuse_entry_out *feo, struct fuse_attr_out *fao)
+static void 
+perfuse_newinfo_setttl(struct puffs_newinfo *pni,
+    struct puffs_node *pn, struct fuse_entry_out *feo,
+    struct fuse_attr_out *fao)
 {
 #ifdef PERFUSE_DEBUG
        if ((feo == NULL) && (fao == NULL))
@@ -362,6 +359,8 @@
        if (feo != NULL) {
                struct timespec va_ttl;
                struct timespec cn_ttl;
+               struct timespec now;
+               struct perfuse_node_data *pnd = PERFUSE_NODE_DATA(pn);
 
                va_ttl.tv_sec = feo->attr_valid;
                va_ttl.tv_nsec = feo->attr_valid_nsec;
@@ -370,114 +369,17 @@
 
                puffs_newinfo_setvattl(pni, &va_ttl);
                puffs_newinfo_setcnttl(pni, &cn_ttl);
+        
+               if (clock_gettime(CLOCK_REALTIME, &now) != 0)
+                       DERR(EX_OSERR, "clock_gettime failed"); 
+
+                timespecadd(&now, &cn_ttl, &pnd->pnd_cn_expire);
        }
 
        return; 
 }
-#else /* PUFFS_KFLAG_CACHE_FS_TTL */
-static void 
-set_expire(puffs_cookie_t opc, struct fuse_entry_out *feo,
-          struct fuse_attr_out *fao)
-{
-       struct puffs_node *pn = (struct puffs_node *)opc;
-       struct perfuse_node_data *pnd = PERFUSE_NODE_DATA(opc);
-       struct timespec entry_ts;
-       struct timespec attr_ts;
-       struct timespec now;
-
-       if (clock_gettime(CLOCK_REALTIME, &now) != 0)
-               DERR(EX_OSERR, "clock_gettime failed");
-
-       if ((feo == NULL) && (fao == NULL))
-               DERRX(EX_SOFTWARE, "%s: feo and fao NULL", __func__);
-
-       if ((feo != NULL) && (fao != NULL))
-               DERRX(EX_SOFTWARE, "%s: feo and fao != NULL", __func__);
-
-       if (feo != NULL) {
-               entry_ts.tv_sec = (time_t)feo->entry_valid;
-               entry_ts.tv_nsec = (long)feo->entry_valid_nsec;
-
-               timespecadd(&now, &entry_ts, &pnd->pnd_entry_expire);
-
-               attr_ts.tv_sec = (time_t)feo->attr_valid;
-               attr_ts.tv_nsec = (long)feo->attr_valid_nsec;
-
-               timespecadd(&now, &attr_ts, &pnd->pnd_attr_expire);
-       } 
-
-       if (fao != NULL) {
-               attr_ts.tv_sec = (time_t)fao->attr_valid;
-               attr_ts.tv_nsec = (long)fao->attr_valid_nsec;
-
-               timespecadd(&now, &attr_ts, &pnd->pnd_attr_expire);
-       } 
-
-       return;
-}
-
-static int
-attr_expired(puffs_cookie_t opc)
-{
-       struct perfuse_node_data *pnd;
-       struct timespec expire;
-       struct timespec now;
-
-       pnd = PERFUSE_NODE_DATA(opc);
-       expire = pnd->pnd_attr_expire;
-
-       if (clock_gettime(CLOCK_REALTIME, &now) != 0)
-               DERR(EX_OSERR, "clock_gettime failed");
-
-       return timespeccmp(&expire, &now, <);
-}
-
-static int
-entry_expired(puffs_cookie_t opc)
-{
-       struct perfuse_node_data *pnd;
-       struct timespec expire;
-       struct timespec now;
-
-       pnd = PERFUSE_NODE_DATA(opc);
-       expire = pnd->pnd_entry_expire;
-
-       if (clock_gettime(CLOCK_REALTIME, &now) != 0)
-               DERR(EX_OSERR, "clock_gettime failed");
-
-       return timespeccmp(&expire, &now, <);
-}
 #endif /* PUFFS_KFLAG_CACHE_FS_TTL */
 
-
-/* 
- * Lookup name in directory opc
- * We take special care of name being . or ..
- * These are returned by readdir and deserve tweaks.
- */
-static int
-node_lookup_dir_nodot(struct puffs_usermount *pu, puffs_cookie_t opc,
-       char *name, size_t namelen, struct puffs_node **pnp)
-{
-       /*
-        * "dot" is easy as we already know it
-        */
-       if (strncmp(name, ".", namelen) == 0) {
-               *pnp = (struct puffs_node *)opc;
-               return 0;
-       }
-
-       /*
-        * "dotdot" is also known
-        */
-       if (strncmp(name, "..", namelen) == 0) {
-               *pnp = PERFUSE_NODE_DATA(opc)->pnd_parent;
-               return 0;
-       }
-


