Subject: CompactFlash performance improvement
To: None <>
From: Artem Belevich <>
List: tech-kern
Date: 02/27/2004 17:10:45
Some time back I whined about painfully slow I/O on CompactFlash
cards in IDE mode. It looks like it could be improved.

To recap, under NetBSD/FreeBSD CompactFlash operations are several
times slower than they are under Windows or Linux.

Recently I spent some time tracking down the problem, and it looks like
some driver changes could speed it up.

The issue is that:
a) Most CompactFlash cards claim to be capable of 1-sector PIO0
   transfers only. I think it was a requirement in an earlier CF+ spec.
b) CF cards have a fairly long command-to-DRQ delay. One of the SanDisk CF
   datasheets says that it can be up to ~1.2ms. My measurements come out
   pretty close to that.
c) The NetBSD driver issues a read command for *every* sector.

Here's a timeline for reading a single sector during 'dd bs=8192 count=<a lot>'.

	     0   issue read command with sector count=1
	+~1300us interrupt happens  (first command takes long)
	+~ 250us fetch the data from the disk

	while (have sectors to read) {
	     0   issue read command with sector count=1
	+ ~300us interrupt happens
	+ ~250us fetch the data from the disk
	+   20us issue next command
	}

All in all it's ~570us/sector or <1MB/s.
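
The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, the timings are the steady-state numbers from the timeline above, and the 512-byte sector size is the one the card reports.

```python
SECTOR_BYTES = 512  # sector size reported by the card


def throughput_bytes_per_s(us_per_sector):
    """Convert a per-sector time in microseconds to bytes per second."""
    return SECTOR_BYTES / (us_per_sector * 1e-6)


# Steady-state timings from the timeline, in microseconds.
WAIT_FOR_IRQ_US = 300  # command issued -> interrupt
FETCH_DATA_US = 250    # PIO copy of one sector
ISSUE_NEXT_US = 20     # issue the next 1-sector command

per_sector_us = WAIT_FOR_IRQ_US + FETCH_DATA_US + ISSUE_NEXT_US
rate = throughput_bytes_per_s(per_sector_us)
print(f"{per_sector_us} us/sector, {rate / 1000:.0f} KB/s")
# prints "570 us/sector, 898 KB/s"
```

That agrees with the ~900K/s measured under NetBSD.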

I've experimented a bit with ATA drivers I've got from and with the same CF card I'm able
to achieve a ~3.3MB/s read rate compared to the ~900KB/s I've got from the
same card with NetBSD.

The main difference between that code and NetBSD is that
NetBSD issues a read command with a sector count of 1, while that
driver reads several sectors per command (32 in my case).

The timings in this case look like this:

	     0   issue read command with sector count=N
	+~1300us interrupt happens  (first command takes long)
	+~ 250us fetch the data from the disk

	while (more sectors) {
	         wait for next interrupt
	+   11us interrupt happens  (*** only 11us until CF delivers next sector)
	+  250us fetch data
	}

In this case it's ~260us/sector or ~2MB/s.
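
The ~260us/sector figure is the steady-state cost inside one command. A rough Python sketch of how the startup delay amortizes over N sectors follows. It assumes, pessimistically, that every multi-sector command pays the full ~1300us first-interrupt delay; that is an assumption, not something I measured, and the function names are made up for illustration.

```python
SECTOR_BYTES = 512

FIRST_IRQ_US = 1300  # command -> first interrupt (assumed for every command)
NEXT_IRQ_US = 11     # gap before each following sector is ready
FETCH_DATA_US = 250  # PIO copy of one sector


def multi_sector_read_us(n_sectors, first_irq_us=FIRST_IRQ_US):
    """Total time in microseconds for one read command covering n_sectors."""
    first = first_irq_us + FETCH_DATA_US
    rest = (n_sectors - 1) * (NEXT_IRQ_US + FETCH_DATA_US)
    return first + rest


def rate_kb_per_s(n_sectors, first_irq_us=FIRST_IRQ_US):
    """Effective read rate in KB/s (1 KB = 1000 bytes) for one command."""
    total_s = multi_sector_read_us(n_sectors, first_irq_us) * 1e-6
    return n_sectors * SECTOR_BYTES / total_s / 1000


for n in (1, 8, 32, 256):
    print(f"N={n:3d}: {rate_kb_per_s(n):.0f} KB/s")
```

Under this model the rate climbs toward the ~2MB/s limit as N grows, so larger sector counts per command matter more than shaving the per-command overhead.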

So, if we could change the NetBSD driver to issue a single read command for
N sectors at a time, we'd be able to speed up CF I/O quite a bit.

Note that I'm talking about the regular read/write commands (0x20/0x30),
not the read-multiple/write-multiple commands (0xc4/0xc5).
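
To make the distinction concrete, here is a small Python sketch of how many data blocks (each one a separate DRQ/interrupt step) a single command produces. The opcodes are the ones named above; the function name and the block_size parameter are hypothetical, with block_size standing for the sectors-per-block value a drive would accept via SET MULTIPLE MODE.

```python
import math

# ATA command opcodes named in the text.
READ_SECTORS = 0x20
WRITE_SECTORS = 0x30
READ_MULTIPLE = 0xC4
WRITE_MULTIPLE = 0xC5


def data_blocks(opcode, n_sectors, block_size=1):
    """Number of DRQ data blocks one command transfers.

    The regular commands move one sector per block regardless of
    block_size; the MULTIPLE commands move block_size sectors per block.
    """
    if opcode in (READ_SECTORS, WRITE_SECTORS):
        return n_sectors
    if opcode in (READ_MULTIPLE, WRITE_MULTIPLE):
        return math.ceil(n_sectors / block_size)
    raise ValueError(f"unsupported opcode 0x{opcode:02x}")


# One 0x20 command for 32 sectors: 1 command, 32 per-sector data blocks.
print(data_blocks(READ_SECTORS, 32))
# 0xc4 with a 16-sector block would need only 2, but these cards
# report 1-sector transfers only, so that path is not available.
print(data_blocks(READ_MULTIPLE, 32, block_size=16))
```

So the gain I'm after comes purely from issuing fewer commands, with the card still handing over one sector per interrupt.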

Now, I'm not an expert on ATA/IDE, and I'd appreciate it if someone more
familiar with the subject could take a look at the issue and see if and
how it can be dealt with.

I wonder if the issue merits opening a PR for it.


On Sun, Jul 14, 2002 at 07:26:00AM +0200, Wojciech Puchar <> wrote:
> > Not the one(s) I have.
> >
> > wd1 at wdc0 channel 0 drive 0: <Hitachi CVM1.3.3>
> > wd1: drive supports 1-sector PIO transfers, LBA addressing
> > wd1: 244 MB, 695 cyl, 15 head, 48 sec, 512 bytes/sect x 500400 sectors
> >
> > So far all the CompactFlash cards I saw (sandisk/lexar/io data/several
> > no-names) reported themselves as PIO 0 with 1-sector-at-a-time transfer
> > capability. :-( I believe that's what one of the earlier CF specs
> > insisted upon.
> >
> > I saw some references to the multisector-capable CF cards on the web,
> > but I haven't seen any of them myself yet.
> i think it's possible they really can multisector but don't report. and
> with >1sec/s will be same 3000us delay
> >
> > --Artem
> >
> > On Sat, Jul 13, 2002 at 08:27:23AM +0200, Wojciech Puchar <> wrote:
> > > > snapshot for both windows and NetBSD. This should at least give us a
> > > > hint why are we so slow.
> > > >
> > > > What I see is that single-sector read consists of
> > > >
> > > > 	command + 275us wait time + 150us data transfer time = 430us.
> > > >
> > > > Write, on the other hand, looks a bit different:
> > > > 	command+150us data transfer+2850us wait time = 3000us
> > > >
> > > aren't these IDE-flash devices capable of IDE multisector?
> > >
> >
> --------------------------------------------------------------------
> Characteristic features of software development are exponential
> growth in hardware requirements, quadratic growth in the number of bugs,
> and linear growth in the number of bells and whistles with less than
> linear growth in functionality