NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
kern/45129: Write handling in puffs(4) broken
>Number: 45129
>Category: kern
>Synopsis: Write handling in puffs(4) broken
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 09 23:30:00 +0000 2011
>Originator: tron%zhadum.org.uk@localhost
>Release: NetBSD 5.99.54 2011-07-09 sources
>Organization:
Matthias Scheler http://zhadum.org.uk/
>Environment:
System: NetBSD lyssa.zhadum.org.uk 5.99.54 NetBSD 5.99.54 (LYSSA) #1: Sat Jul 9
20:32:01 BST 2011 tron%lyssa.zhadum.org.uk@localhost:/src/sys/compile/LYSSA i386
Architecture: i386
Machine: i386
>Description:
Writing to "fuse-ext2" ("pkgsrc/filesystems/fuse-ext2") works fine
under NetBSD/amd64 5.1_STABLE but fails with a "Protocol Error"
under NetBSD/i385 5.99.54.
Using debugging output I found out that the EPROTO error is return by this
bit of code in puffs_vnop_write() in "src/sys/fs/puffs/puffs_vnops.c":
if (write_msg->pvnr_resid > tomove) {
puffs_senderr(pmp, PUFFS_ERR_WRITE,
E2BIG, "resid grew", VPTOPNC(ap->a_vp));
error = EPROTO;
break;
}
With more debugging output I found out that write_msg->pvnr_resid
gets clobbered by this call:
PUFFS_MSG_ENQUEUEWAIT2(pmp, park_write, vp->v_data,
NULL, error);
PUFFS_MSG_ENQUEUEWAIT2() is defined as follows:
#define PUFFS_MSG_ENQUEUEWAIT2(pmp, park, vp1, vp2, var) \
do { \
puffs_msg_enqueue(pmp, park); \
var = puffs_msg_wait2(pmp, park, vp1, vp2); \
} while (/*CONSTCOND*/0)
I've changed the code in puffs_vnop_write() to call both functions
seperately and found out that it is puffs_msg_wait2() which clobbers
write_msg->pvnr_resid. puffs_msg_wait2() is relatively short
function that calls puffs_msg_wait() and never writes to park_write->preq
(which points to write_msg). So I guess the bug is in puffs_msg_wait().
Here are the difference to this function between "netbsd-5" (which
works) and HEAD (which doesn't work):
--- old 2011-07-10 00:10:04.000000000 +0100
+++ new 2011-07-10 00:09:43.000000000 +0100
@@ -1,23 +1,35 @@
int
puffs_msg_wait(struct puffs_mount *pmp, struct puffs_msgpark *park)
{
+ lwp_t *l = curlwp;
+ proc_t *p = l->l_proc;
struct puffs_req *preq = park->park_preq; /* XXX: hmmm */
- struct mount *mp = PMPTOMP(pmp);
+ sigset_t ss;
+ sigset_t oss;
int error = 0;
int rv;
+ /*
+ * block unimportant signals.
+ *
+ * The set of "important" signals here was chosen to be same as
+ * nfs interruptible mount.
+ */
+ sigfillset(&ss);
+ sigdelset(&ss, SIGINT);
+ sigdelset(&ss, SIGTERM);
+ sigdelset(&ss, SIGKILL);
+ sigdelset(&ss, SIGHUP);
+ sigdelset(&ss, SIGQUIT);
+ mutex_enter(p->p_lock);
+ sigprocmask1(l, SIG_BLOCK, &ss, &oss);
+ mutex_exit(p->p_lock);
+
mutex_enter(&pmp->pmp_lock);
puffs_mp_reference(pmp);
mutex_exit(&pmp->pmp_lock);
mutex_enter(&park->park_mtx);
- if ((park->park_flags & PARKFLAG_WANTREPLY) == 0
- || (park->park_flags & PARKFLAG_CALL)) {
- mutex_exit(&park->park_mtx);
- rv = 0;
- goto skipwait;
- }
-
/* did the response beat us to the wait? */
if (__predict_false((park->park_flags & PARKFLAG_DONE)
|| (park->park_flags & PARKFLAG_HASERROR))) {
@@ -26,6 +38,13 @@
goto skipwait;
}
+ if ((park->park_flags & PARKFLAG_WANTREPLY) == 0
+ || (park->park_flags & PARKFLAG_CALL)) {
+ mutex_exit(&park->park_mtx);
+ rv = 0;
+ goto skipwait;
+ }
+
error = cv_wait_sig(&park->park_cv, &park->park_mtx);
DPRINTF(("puffs_touser: waiter for %p woke up with %d\n",
park, error));
@@ -74,24 +93,14 @@
mutex_exit(&park->park_mtx);
}
- /*
- * retake the lock and release. This makes sure (haha,
- * I'm humorous) that we don't process the same vnode in
- * multiple threads due to the locks hacks we have in
- * puffs_lock(). In reality this is well protected by
- * the biglock, but once that's gone, well, hopefully
- * this will be fixed for real. (and when you read this
- * comment in 2017 and subsequently barf, my condolences ;).
- */
- if (rv == 0 && !fstrans_is_owner(mp)) {
- fstrans_start(mp, FSTRANS_NORMAL);
- fstrans_done(mp);
- }
-
skipwait:
mutex_enter(&pmp->pmp_lock);
puffs_mp_release(pmp);
mutex_exit(&pmp->pmp_lock);
+ mutex_enter(p->p_lock);
+ sigprocmask1(l, SIG_SETMASK, &oss, NULL);
+ mutex_exit(p->p_lock);
+
return rv;
}
I can unfortunately not find the bug in it.
>How-To-Repeat:
1.) Mount "fuse-ext2".
2.) Try to copy a file to it. You'll get a "Protocol Error" message
from "cp".
>Fix:
Not known.
Home |
Main Index |
Thread Index |
Old Index