tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

sys/dev/dksubr.c may reorder requests



Hello,
sys/dev/dksubr.c:dk_start() has this code:
        while ((bp = bufq_get(dksc->sc_bufq)) != NULL) {
                if (di->di_diskstart(dksc, bp) != 0) {
                        bufq_put(dksc->sc_bufq, bp);
                        break;
                }
        }

This potentially causes a buffer to be dequeued and requeued.
Even with disksort strategy, it seems that the order is not preserved.
I noticed this with xbd(4) (xen's virtual disk driver), where its start routine
would get out of order requests as the pagedaemon (or the filesystem) generate
properly ordered writes. The attached patch fixes this issue, and with
on a Xen guest I can see a dd if=/dev/zero of=file bs=1m count=1000 bump from
less to 30MB/s to more than 100MB/s. This is likely because the underlying
disk has a strategy to detect sequential writes, which fails with the
reordering caused by dk_start().

It looks like xbd(4) is the only user of sys/dev/dksubr.c, which may be why
is has not been noticed earlier.

Any objection to the attached patch ?

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 ans d'experience feront toujours la difference
--
Index: dksubr.c
===================================================================
RCS file: /cvsroot/src/sys/dev/dksubr.c,v
retrieving revision 1.42
diff -u -p -u -r1.42 dksubr.c
--- dksubr.c    19 Nov 2010 06:44:39 -0000      1.42
+++ dksubr.c    19 May 2014 18:12:44 -0000
@@ -234,10 +240,21 @@ dk_start(struct dk_intf *di, struct dk_s
 
        DPRINTF_FOLLOW(("dk_start(%s, %p)\n", di->di_dkname, dksc));
 
+       /* first see if there's a buf waiting for ressources */
+       if (dksc->sc_nextbufp != NULL) {
+               if (di->di_diskstart(dksc, dksc->sc_nextbufp) != 0) {
+                       /* still no ressources, just return */
+                       return;
+               }
+       }
+       dksc->sc_nextbufp = NULL; /* it has been queued now */
+
        /* Process the work queue */
        while ((bp = bufq_get(dksc->sc_bufq)) != NULL) {
                if (di->di_diskstart(dksc, bp) != 0) {
-                       bufq_put(dksc->sc_bufq, bp);
+                       KASSERT(dksc->sc_nextbufp == NULL);
+                       /* try bp again later */
+                       dksc->sc_nextbufp = bp;
                        break;
                }
        }
Index: dkvar.h
===================================================================
RCS file: /cvsroot/src/sys/dev/dkvar.h,v
retrieving revision 1.15
diff -u -p -u -r1.15 dkvar.h
--- dkvar.h     19 Nov 2010 06:44:39 -0000      1.15
+++ dkvar.h     19 May 2014 18:12:44 -0000
@@ -56,6 +56,7 @@ struct dk_softc {
        char                     sc_xname[DK_XNAME_SIZE]; /* external name */
        struct disk              sc_dkdev;      /* generic disk info */
        struct bufq_state       *sc_bufq;       /* buffer queue */
+       struct buf              *sc_nextbufp;   /* buf waiting for ressources */
 };
 
 /* sc_flags:


Home | Main Index | Thread Index | Old Index