tech-kern archive


Re: bmap/strategy for vnd backing file



manu%netbsd.org@localhost (Emmanuel Dreyfus) writes:

>Yes, I meant "the former". As I understand, using read/write will in the
>end use bmap/strategy but with the overhead of more layers, is that
>right?

Yes, but the layers are only a small issue.

The read/write path calls vn_rdwr() and then VOP_PUTPAGES() with several
hints. The result is that vn_rdwr() performs regular file I/O with some
amount of read-ahead, and then the cached pages are flushed; in the case
of writes they are flushed synchronously. The CPU overhead is also not
negligible.

This is done for two reasons: First to avoid double-caching the data
by the underlying filesystem in addition to a filesystem mounted on
top of vnd. Second to avoid overloading the pagermap which may end
in deadlocks between both layers.

This patch:

@@ -808,12 +830,15 @@
        bp->b_error =
            vn_rdwr(doread ? UIO_READ : UIO_WRITE,
            vp, bp->b_data, len, offset, UIO_SYSSPACE,
-           IO_ADV_ENCODE(POSIX_FADV_NOREUSE), vnd->sc_cred, &resid, NULL);
+           IO_ADV_ENCODE(POSIX_FADV_NORMAL) | IO_DIRECT,
+           vnd->sc_cred, &resid, NULL);
        bp->b_resid = resid;
 
+#if 0
        mutex_enter(vp->v_interlock);
        (void) VOP_PUTPAGES(vp, 0, 0,
            PGO_ALLPAGES | PGO_CLEANIT | PGO_FREE | PGO_SYNCIO);
+#endif

makes that part perform like regular file I/O, but of course addresses
neither of the two issues. IO_DIRECT was supposed to help, but I don't
see an effect.


N.B. IO_ADV_ENCODE() passes only 2 bits of advice while the POSIX flags
occupy 3, so POSIX_FADV_NOREUSE aliases to POSIX_FADV_RANDOM.
I tend to believe that this is a bug; on the other hand, we mostly
ignore the flags.
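
The aliasing can be reproduced outside the kernel with a small sketch.
The EX_* names are hypothetical stand-ins, and the numeric values and
the 2-bit mask are assumptions about what <sys/fcntl.h> and
<sys/vnode.h> define, so check them against your tree before relying
on this:

```c
#include <assert.h>

/*
 * Assumed advice values (believed to match NetBSD's <sys/fcntl.h>).
 * The EX_ prefix marks them as illustration-only copies.
 */
enum {
	EX_FADV_NORMAL     = 0,
	EX_FADV_RANDOM     = 1,
	EX_FADV_SEQUENTIAL = 2,
	EX_FADV_WILLNEED   = 3,
	EX_FADV_DONTNEED   = 4,
	EX_FADV_NOREUSE    = 5
};

/* Assumed encoding: only the low 2 bits of the advice are kept. */
#define EX_IO_ADV_MASK		0x3
#define EX_IO_ADV_SHIFT		0
#define EX_IO_ADV_ENCODE(adv)	(((adv) << EX_IO_ADV_SHIFT) & EX_IO_ADV_MASK)
#define EX_IO_ADV_DECODE(flags)	(((flags) & EX_IO_ADV_MASK) >> EX_IO_ADV_SHIFT)

/* Return the advice value that survives an encode/decode round trip. */
static int
ex_advice_roundtrip(int adv)
{
	return EX_IO_ADV_DECODE(EX_IO_ADV_ENCODE(adv));
}
```

Under these assumptions ex_advice_roundtrip(EX_FADV_NOREUSE) yields
EX_FADV_RANDOM, because 5 & 3 == 1. Widening the mask to 3 bits would
keep the values apart, provided the extra bit is free in the ioflag
word.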


As to the other problem: I changed the code to get the sector size
from the underlying device (we already query it to identify sparse
files). This fixes the problem that getdisksize() can't get data from
VREG vnodes. I also made this a request-by-request decision, so that
the bmap/strategy path is used whenever a request is large enough to
allow it. Most filesystem I/O is sized and aligned well enough to use
bmap/strategy in spite of incompatible fundamental I/O sizes.
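
A per-request decision of this kind could look roughly like the sketch
below. It is not the code from the change; ex_can_use_strategy() and
its parameters are hypothetical, and the assumption is that a request
qualifies when both its offset and its length are multiples of the
sector size of the underlying device:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether one request may take the
 * bmap/strategy path. secsize is the sector size reported by the
 * device underneath the backing file.
 */
static bool
ex_can_use_strategy(int64_t offset, size_t len, uint32_t secsize)
{
	/* Reject degenerate input rather than dividing by zero. */
	if (secsize == 0 || len == 0 || offset < 0)
		return false;

	/* The request must start on a sector boundary ... */
	if (offset % secsize != 0)
		return false;

	/* ... and cover a whole number of sectors. */
	if (len % secsize != 0)
		return false;

	return true;
}
```

A request that fails the check would fall back to the read/write path,
so a 512-byte request against a 4096-byte backing sector still works,
only more slowly.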

-- 
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."

