Subject: Re: followup to ATA flash newfs problem (WAS: newfs problem: "cg 0: bad magic number")
To: Tad Hunt <tad@entrisphere.com>
From: Artem Belevich <art@riverstonenet.com>
List: tech-kern
Date: 07/12/2002 10:48:33
On Thu, Jul 11, 2002 at 12:55:27PM -0700, Tad Hunt <tad@entrisphere.com> wrote:
> 1) The eFilm card is extremely slow.  It only does PIO mode 1, and
>    the write speed seems to be just over 150KB/sec.  Contrast this
>    to the "SMART Modular technologies ATA PC Card", which does PIO
>    Mode 4, and seems to top out just over 800KB/sec.

Don't be too fast to blame everything on the flash card.  There may be
something wrong with ATA Flash/CompactFlash reading/writing
performance in NetBSD in general.

I've got a fairly fast CompactFlash card (which claims to be PIO mode 0,
as the CF spec says) and ran some read/write tests on Win2k, Linux
and NetBSD. The results are very strange.

		Read	Write
w2k		~900K/s	~900K/s
Linux		~900K/s	~900K/s
NetBSD(2)	~1.1M/s	~150K/s  
NetBSD(1)	~1.1M/s	~150K/s  - looks suspiciously close to your 150K/s

NetBSD:
	(1) - CF card attached to an IDE controller in True-IDE mode
	(2) - CF card is in a PCMCIA slot (through a CF-ATA adapter)
w2k/Linux:
	CF is in a USB CF reader (Delkin eFilm). I think that USB is the
	limiting factor for CF performance there.

Now, why is NetBSD so slow at writing? The problem is not i386 specific,
as I get *exactly* the same performance from CF under the powerpc/walnut
port. Is there something in the wdc driver that needs tweaking?

If some of the IDE gurus are willing to investigate this, I can get a
logic analyzer hooked up to the CF card and capture a snapshot of the
IDE bus activity for both Windows and NetBSD. This should at least give
us a hint as to why we are so slow.

What I see is that a single-sector read consists of

	command + 275us wait time + 150us data transfer time = 430us. 

Write, on the other hand, looks a bit different:
	command+150us data transfer+2850us wait time = 3000us

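My interpretation is that the wait time is spent waiting for the IORDY
signal that should trigger the interrupt. So, 3ms per sector gives us
~330 sectors/s, which at 512 bytes per sector is roughly 165K/s. That is
in the same ballpark as the 150K/s write performance we measure.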
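
As a quick sanity check of the arithmetic, here is a small sketch that
converts the per-sector times above into throughput. The 512-byte sector
size and the helper name throughput() are assumptions made for
illustration; the 430us and 3000us figures are the ones quoted above.

```python
SECTOR_BYTES = 512  # assumed sector size; not stated in the measurements


def throughput(per_sector_us):
    """Return (sectors per second, bytes per second) for one sector every per_sector_us."""
    sectors_per_s = 1_000_000 / per_sector_us
    return sectors_per_s, sectors_per_s * SECTOR_BYTES


# Per-sector times taken from the bus timings quoted above.
for label, us in (("read", 430), ("write", 3000)):
    sectors, bytes_per_s = throughput(us)
    print(f"{label}: {sectors:.0f} sectors/s, {bytes_per_s / 1024:.0f} KiB/s")
```

Under these assumptions the read figure comes out a little over 1.1M/s,
which agrees with the table, and the write figure lands somewhat above
the measured 150K/s, so the per-sector wait accounts for essentially all
of the write slowdown.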

I wonder if we can just poll for write completion instead of waiting
for the interrupt to happen? How do we turn on polling mode, then?
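
To make the polling idea concrete, here is a rough sketch of the loop I
have in mind, written as a simulation rather than driver code. The only
hardware fact it relies on is that BSY is bit 7 (0x80) of the ATA status
register. The function wait_not_busy() and the read_status callable are
hypothetical names for illustration; this is not how the wdc driver is
actually structured.

```python
import time

ATA_STATUS_BSY = 0x80  # bit 7 of the ATA status register: device is busy


def wait_not_busy(read_status, timeout_s=0.01, now=time.monotonic):
    """Spin until BSY clears.

    read_status is a hypothetical callable that returns the status byte.
    Returns the elapsed time in seconds, or None if the timeout expires.
    """
    start = now()
    while True:
        if not (read_status() & ATA_STATUS_BSY):
            return now() - start
        if now() - start > timeout_s:
            return None


# Simulated device: busy for three reads, then ready (0x50 = DRDY | DSC).
statuses = iter([0x80, 0x80, 0x80, 0x50])
print(wait_not_busy(lambda: next(statuses)) is not None)
```

Whether spinning like this is acceptable depends on how long the card
really stays busy; if the ~2.85ms is genuine device time rather than
interrupt latency, polling would burn CPU without making writes faster.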

--Artem