tech-kern archive


Re: ata(4) and NCQ



On Tue, Apr 19, 2011 at 07:57:56PM +0000, Jonathan A. Kollasch wrote:
> Hi,
> 
> I'm (again) considering adding NCQ support to the ATA subsystem.
> 
> I've gained some experience with out of order transaction completion
> in a virtio block driver I've worked on, however that was rather
> simplistic compared to shoehorning it into existing and somewhat
> dissimilar code.
> 
> I am rather bewildered by the prospects of redesigning the whole ATA
> subsystem.  Basically, I am looking for ideas on a proper design of
> NCQ support, so any pointers on where to start would be appreciated.

I believe FreeBSD ultimately ended up faking up ATA disks to look like
SCSI ones, and using their existing SCSI midlayer to manage tags.

It would be good to avoid that.  The SCSI code is really large.

However, it does do what's wanted, so perhaps a careful examination of
how it manages tags and openings would be a good first step.

I believe the NCQ rules are much like the SCSI tag ordering rules:
commands with simple tags may finish in any order, but ordered tags act
as barriers.  We use simple (non-ordered) tags for all reads, but
ordered tags for writes that have B_SYNC, IIRC.  It may be that we have
reverted to using them for all writes.
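
To make the policy above concrete, here is a minimal sketch of the tag
choice as I described it from memory.  Every name in it (tag_type,
io_req, choose_tag) is hypothetical and made up for illustration; it is
not the scsipi or ata(4) interface, and the real code may differ.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not the NetBSD scsipi or ata(4) API. */
enum tag_type { TAG_SIMPLE, TAG_ORDERED };

struct io_req {
	bool is_write;	/* false for a read */
	bool is_sync;	/* caller waits for this write (B_SYNC-style) */
};

/*
 * Pick a tag for a request: reads and asynchronous writes get a simple
 * tag, synchronous writes get an ordered tag so they act as a barrier.
 */
static enum tag_type
choose_tag(const struct io_req *req)
{
	if (req->is_write && req->is_sync)
		return TAG_ORDERED;
	return TAG_SIMPLE;
}
```

If we have in fact gone back to ordered tags for all writes, the
is_sync test would simply drop out of the condition.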

For SCSI targets, even simple tags don't get reordered on write (though
you still get the benefit that many can be pending at once, letting the
drive queue them up and fill its track caches, etc.) unless a
non-default tag reordering policy is selected in the device's mode pages.

With appropriate use of ordered tags and/or cache flushes as barriers,
it should always be safe -- and highly advantageous -- to use such a
policy.  You basically get the performance benefit of running with write
back caching enabled, but without the danger of data loss.  An example
of a kernel that really gets this right with regard to SCSI disks is
IRIX, where you can turn the write cache off and on willy-nilly and
performance basically doesn't change (it's good regardless, since they
whack the mode pages to allow the drive to reorder, and use ordered tags
as barriers).
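
As a rough model of what "ordered tags as barriers" means when the drive
is otherwise free to reorder, the sketch below decides whether a queued
command may be started.  It assumes the usual SCSI reading (an ordered
command waits for everything older, and everything newer waits for it);
the names qtag, queued_cmd and may_start are hypothetical, and this is a
toy model rather than anything a real driver or drive does verbatim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tagged queue; not taken from any real driver. */
enum qtag { QTAG_SIMPLE, QTAG_ORDERED };

struct queued_cmd {
	enum qtag tag;
	bool done;	/* command has completed */
};

/*
 * May command i be started, given q[0..n) in submission order?
 * Simple commands may run in any order relative to each other, but
 * nothing may cross an unfinished ordered command in either direction.
 */
static bool
may_start(const struct queued_cmd *q, size_t n, size_t i)
{
	if (i >= n || q[i].done)
		return false;
	for (size_t j = 0; j < i; j++) {
		if (q[j].done)
			continue;
		/* An unfinished older ordered command blocks all later ones. */
		if (q[j].tag == QTAG_ORDERED)
			return false;
		/* An ordered command waits for every older command. */
		if (q[i].tag == QTAG_ORDERED)
			return false;
	}
	return true;
}
```

With a queue of simple, simple, ordered, simple, the first two may start
at once in either order, the ordered one only after both have finished,
and the last only after the ordered one has finished.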

How this maps onto ATA drives and NCQ, I don't know, but the overall
issues and history are worth keeping in mind.

Thor

