Subject: patch. Re: CompactFLash performance improvement
To: None <tech-kern@NetBSD.org>
From: Artem Belevich <art@riverstonenet.com>
List: tech-kern
Date: 03/01/2004 13:01:48
--fdj2RfSjLxBAspz7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I've got the driver to do what I wanted.

The end result is pretty good: writes are 3-10x faster and reads are ~3x faster.

I've attached the diff and opened PR kern/24633.

Here's a more detailed list of benchmarks using iozone:

  iozone -s 16m -r 4k -B -e -c -o -f /mnt3a/iozone.dat -U /mnt3a

                                                   rnd    rnd  bkwd   record  stride
                    write  rewrite  read  reread  read  write  read  rewrite    read
Was:
 Transcend 45x       1088      716  2128    2126  2184    451  1996   250807    2086
 SimpleTech Pro      1016      636  1722    1722  1779    382  1646   257942    1701
 Sandisk Ultra II    1131      677  1710    1710  1765    356  1622   324070    1669
 SimpleTech/OEM       216      175   909     911   952    261   909   376867     909

Now:
 Sandisk Ultra II    5356     2573  5248    5248  5285    537  4726  1643022    5060
 Transcend 45x       4975     2502  5259    5258  5360    683  4893  1615787    5108
 SimpleTech Pro      3578     1921  4325    4324  4420    537  4058   351287    4221
 SimpleTech/OEM      2591     1412  3176    3176  3276   1321  3057  1186478    3128

--Artem

On Fri, Feb 27, 2004 at 05:10:45PM -0800, Artem Belevich <art@riverstonenet.com> wrote:
> Some time back I whined about painfully slow I/O on CompactFlash
> cards in IDE mode. Looks like it could be improved.
> 
> To recap, under NetBSD/FreeBSD CompactFlash operations are several
> times slower than they are under Windows or Linux.
> 
> Recently I've spent some time tracking down the problem and it looks like
> with some driver changes it would be possible to speed it up.
> 
> The issue is that:
> a) most CompactFlash cards claim to be capable of 1-sector PIO0
>    transfers only. I think it was a requirement in the CF+ spec before
>    rev1.4.
> b) CF cards have a fairly long command-to-DRQ delay. One of the Sandisk CF
>    datasheets says that it can be up to ~1.2ms. My measurements show a
>    number pretty close to that.
> c) the NetBSD driver issues a read command for *every* sector.
> 
> Here's a timeline for reading a single sector during 'dd bs=8192 count=<a lot>'.
> 
>              0   issue read command with sector count=1
>       +~1300us   interrupt happens (first command takes long)
>       +~ 250us   fetch the data from the disk
> 
>       while (have sectors to read) {
>              0   issue read command with sector count=1
>       +~ 300us   interrupt happens
>       +~ 250us   fetch the data from the disk
>       +   20us   issue next command
>       }
> 
> All in all it's ~570us/sector or <1MB/s.
> 
> I've experimented a bit with the ATA drivers I got from
> http://www.ata-atapi.com/drvr.htm and with the same CF card I'm able
> to achieve a ~3.3MB/s read rate, compared to the ~900KB/s I get from the
> same card with NetBSD.
> 
> The main difference between ata-atapi.com's code and NetBSD is that
> NetBSD issues read commands with a 1-sector count, while ata-atapi.com's
> driver reads several sectors at a time (32 in my case).
> 
> The timings in this case look like this:
> 
>              0   issue read command with sector count=N
>       +~1300us   interrupt happens (first command takes long)
>       +~ 250us   fetch the data from the disk
> 
>       while (more sectors) {
>                  wait for next interrupt
>       +   11us   interrupt happens (only 11us until the CF card delivers the next sector)
>       +  250us   fetch data
>       }
> 
> In this case it's ~260us/sector or ~2MB/s.
> 
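
As a rough cross-check of the two per-sector estimates quoted above, here is a minimal C sketch of the arithmetic. The constant and function names are made up for illustration. The microsecond values are the ones from the two timelines, and the one-time ~1300us latency of the first command is ignored, so this models steady-state throughput only.

```c
#include <stdio.h>

#define SECTOR_BYTES    512UL

/* Per-sector costs in microseconds, copied from the quoted timelines. */
#define CMD_TO_IRQ_US   300UL  /* command-to-interrupt delay, one command per sector */
#define NEXT_SECTOR_US   11UL  /* delay to the next interrupt inside one N-sector command */
#define PIO_FETCH_US    250UL  /* time to fetch one sector of data from the card */
#define ISSUE_CMD_US     20UL  /* time to issue the next command */

/* One read command per sector: wait for the interrupt, fetch, issue again. */
static unsigned long per_sector_single_us(void)
{
	return CMD_TO_IRQ_US + PIO_FETCH_US + ISSUE_CMD_US;
}

/* One read command for N sectors: only the short inter-sector wait plus the fetch. */
static unsigned long per_sector_multi_us(void)
{
	return NEXT_SECTOR_US + PIO_FETCH_US;
}

/* Steady-state throughput in bytes per second for a given per-sector cost. */
static unsigned long bytes_per_second(unsigned long us_per_sector)
{
	return SECTOR_BYTES * 1000000UL / us_per_sector;
}

/* Print one estimate in the same units the mail uses. */
static void print_estimate(const char *label, unsigned long us_per_sector)
{
	printf("%s: %lu us/sector, about %lu KB/s\n", label,
	    us_per_sector, bytes_per_second(us_per_sector) / 1000UL);
}
```

Calling `print_estimate("count=1", per_sector_single_us())` and `print_estimate("count=N", per_sector_multi_us())` from a `main()` gives figures a little under 1MB/s and a little under 2MB/s, in line with the estimates above.
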
> So, if we could change the NetBSD driver to issue a single read command for
> N sectors at a time, we'd be able to speed up CF I/O quite a bit.
> 
> Note that I'm talking about the regular read/write commands (0x20/0x30),
> not the read-multiple/write-multiple commands (0xc4/0xc5).
> 
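
To make the distinction between the two command families concrete, here is a hedged sketch of the choice being proposed. `struct xfer_req`, `struct ata_cmd` and `plan_command` are hypothetical names invented for this example, not NetBSD driver code. Only the opcode values come from the ATA command set. The sketch assumes a 28-bit command with 1 to 255 sectors per request.

```c
#include <stdint.h>

/* Standard ATA opcodes for PIO data transfers. */
enum {
	ATA_CMD_READ_SECTORS   = 0x20,
	ATA_CMD_WRITE_SECTORS  = 0x30,
	ATA_CMD_READ_MULTIPLE  = 0xC4,
	ATA_CMD_WRITE_MULTIPLE = 0xC5
};

/* Hypothetical description of one I/O request (illustration only). */
struct xfer_req {
	unsigned total_sectors;	/* sectors in the request, assumed 1..255 */
	unsigned drq_block;	/* sectors per interrupt the device advertises; 1 on most CF cards */
	int is_read;		/* nonzero for a read, zero for a write */
};

/* Hypothetical result: what to send and how much to move per interrupt. */
struct ata_cmd {
	uint8_t opcode;
	unsigned sector_count;		/* value for the sector count register */
	unsigned sectors_per_irq;	/* sectors transferred after each interrupt */
};

static struct ata_cmd
plan_command(const struct xfer_req *req)
{
	struct ata_cmd cmd;

	if (req->drq_block > 1 && req->total_sectors > 1) {
		/* Device advertises multi-sector blocks: use the multiple commands. */
		cmd.opcode = req->is_read ?
		    ATA_CMD_READ_MULTIPLE : ATA_CMD_WRITE_MULTIPLE;
		cmd.sectors_per_irq = req->total_sectors < req->drq_block ?
		    req->total_sectors : req->drq_block;
	} else {
		/* 1-sector device: plain read/write, one sector per interrupt. */
		cmd.opcode = req->is_read ?
		    ATA_CMD_READ_SECTORS : ATA_CMD_WRITE_SECTORS;
		cmd.sectors_per_irq = 1;
	}
	/* Either way, a single command covers the whole request. */
	cmd.sector_count = req->total_sectors;
	return cmd;
}
```

For a 32-sector read on a card that reports 1-sector transfers, this yields opcode 0x20 with a sector count of 32 and one sector moved per interrupt, which is the pattern from the second timeline.
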
> Now, I'm not an expert on ATA/IDE and I'd appreciate it if someone more
> familiar with the subject could take a look at the issue and see if and
> how it can be dealt with.
> 
> I wonder if the issue merits opening a PR for it.
> 
> --Artem
> 
> On Sun, Jul 14, 2002 at 07:26:00AM +0200, Wojciech Puchar <wojtek@chylonia.3miasto.net> wrote:
> > > Not the one(s) I have.
> > >
> > > wd1 at wdc0 channel 0 drive 0: <Hitachi CVM1.3.3>
> > > wd1: drive supports 1-sector PIO transfers, LBA addressing
> > > wd1: 244 MB, 695 cyl, 15 head, 48 sec, 512 bytes/sect x 500400 sectors
> > >
> > > So far all the CompactFlash cards I saw (sandisk/lexar/io data/several
> > > no-names) reported themselves as PIO 0 with 1-sector-at-a-time transfer
> > > capability. :-( I believe that's what one of the earlier CF specs
> > > insisted upon.
> > >
> > > I saw some references to the multisector-capable CF cards on the web,
> > > but I haven't seen any of them myself yet.
> > 
> > I think it's possible they really can do multisector transfers but don't
> > report it, and with more than one sector per transfer the delay would be
> > the same 3000us.
> > 
> > 
> > >
> > > --Artem
> > >
> > > On Sat, Jul 13, 2002 at 08:27:23AM +0200, Wojciech Puchar <wojtek@chylonia.3miasto.net> wrote:
> > > > > snapshot for both windows and NetBSD. This should at least give us a
> > > > > hint why are we so slow.
> > > > >
> > > > > What I see is that single-sector read consists of
> > > > >
> > > > > 	command + 275us wait time + 150us data transfer time = 430us.
> > > > >
> > > > > Write, on the other hand, looks a bit different:
> > > > > 	command+150us data transfer+2850us wait time = 3000us
> > > > >
> > > > aren't these IDE-flash devices capable of IDE multisector?
> > > >
> > >
> > 
> > --------------------------------------------------------------------
> > A characteristic feature of software development is exponential growth
> > in hardware requirements, quadratic growth in the number of bugs, and linear
> > growth in the number of bells and whistles with less-than-linear growth in functionality.
> > 
> 

--fdj2RfSjLxBAspz7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="ata.diff"

Index: dev/ata/wdvar.h
===================================================================
diff -u -p -u -r1.4 wdvar.h
--- dev/ata/wdvar.h	21 Sep 2002 03:34:47 -0000	1.4
+++ dev/ata/wdvar.h	1 Mar 2004 20:29:13 -0000
@@ -42,6 +42,7 @@ struct ata_bio {
 #define	ATA_READ	0x0020	/* transfer is a read (otherwise a write) */
 #define	ATA_CORR	0x0040	/* transfer had a corrected error */
 #define	ATA_LBA48	0x0080	/* transfer uses 48-bit LBA adressing */
+#define ATA_MULTI   0x0100  /* transfers are done using read/write multiple command */
 	int		multi;	/* # of blocks to transfer in multi-mode */
 	struct disklabel *lp;	/* pointer to drive's label info */
 	daddr_t		blkno;	/* block addr */
Index: dev/ata/wd.c
===================================================================
diff -u -p -u -r1.4 wd.c
--- dev/ata/wd.c	21 Sep 2002 03:34:46 -0000	1.4
+++ dev/ata/wd.c	1 Mar 2004 20:30:51 -0000
@@ -158,6 +158,7 @@ struct wd_softc {
 #define WDF_LBA		0x040 /* using LBA mode */
 #define WDF_KLABEL	0x080 /* retain label after 'full' close */
 #define WDF_LBA48	0x100 /* using 48-bit LBA mode */
+#define WDF_MULTI   0x200 /* device supports multi-sector transfers */
 	int sc_capacity;
 	int cyl; /* actual drive parameters */
 	int heads;
@@ -302,6 +303,7 @@ wdattach(parent, self, aux)
 
 	if ((wd->sc_params.atap_multi & 0xff) > 1) {
 		wd->sc_multi = wd->sc_params.atap_multi & 0xff;
+		wd->sc_flags |= WDF_MULTI;
 	} else {
 		wd->sc_multi = 1;
 	}
@@ -557,10 +559,12 @@ __wdstart(wd, bp)
 	 * the sector number of the problem, and will eventually allow the
 	 * transfer to succeed.
 	 */
-	if (wd->sc_multi == 1 || wd->retries >= WDIORETRIES_SINGLE)
+	if (wd->retries >= WDIORETRIES_SINGLE)
 		wd->sc_wdc_bio.flags = ATA_SINGLE;
 	else
 		wd->sc_wdc_bio.flags = 0;
+	if (wd->sc_flags & WDF_MULTI)
+		wd->sc_wdc_bio.flags |= ATA_MULTI;
 	if (wd->sc_flags & WDF_LBA48)
 		wd->sc_wdc_bio.flags |= ATA_LBA48;
 	if (wd->sc_flags & WDF_LBA)
Index: dev/ata/ata_wdc.c
===================================================================
diff -u -p -u -r1.4 ata_wdc.c
--- dev/ata/ata_wdc.c	21 Sep 2002 03:34:45 -0000	1.4
+++ dev/ata/ata_wdc.c	1 Mar 2004 20:34:44 -0000
@@ -366,7 +366,7 @@ again:
 		} /* else not DMA */
 		ata_bio->nblks = min(nblks, ata_bio->multi);
 		ata_bio->nbytes = ata_bio->nblks * ata_bio->lp->d_secsize;
-		if (ata_bio->nblks > 1 && (ata_bio->flags & ATA_SINGLE) == 0) {
+		if (ata_bio->nblks > 1 && (ata_bio->flags & ATA_MULTI)) {
 			cmd = (ata_bio->flags & ATA_READ) ?
 			    WDCC_READMULTI : WDCC_WRITEMULTI;
 		} else {
@@ -785,7 +785,7 @@ again:
 
 	case MULTIMODE:
 	multimode:
-		if (ata_bio->multi == 1)
+		if ((ata_bio->flags & ATA_MULTI) == 0 || ata_bio->multi == 1)
 			goto ready;
 		wdccommand(chp, xfer->drive, WDCC_SETMULTI, 0, 0, 0,
 		    ata_bio->multi, 0);

--fdj2RfSjLxBAspz7--