Subject: Re: Anyone working on ATA over Ethernet?
To: Daniel Carosone <dan@geek.com.au>
From: Jason Thorpe <thorpej@shagadelic.org>
List: tech-kern
Date: 02/14/2005 19:16:45
On Feb 14, 2005, at 5:18 PM, Daniel Carosone wrote:

> On their sales quotes for the other
> side, iSCSI replaces somewhat expensive FCAL HBA's with even more
> expensive HBA's that offload iSCSI and TCP processing to the card.

Perhaps we're getting off on a tangent here, but...

As I see it, one of the main flaws of iSCSI is the "software-ness" of
it.  It's not a flaw in the protocol itself, but it is a flaw when you
consider the expectations people have of their storage system.  They
want 110MB/s out of each Gig-E port, but that comes with a CPU usage
cost, a cost you do not pay with FC, because FC offloads all of that
transport and link processing.  Sure, you get the performance you
expect out of your storage system, but you don't get the performance
you expect out of the applications running on the hosts that talk to
it, because those hosts don't have much CPU left.
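
To make the CPU cost concrete, here's a toy sketch; it has nothing to
do with any particular iSCSI stack, and the buffer size and pass count
are arbitrary.  It just measures how fast one CPU can run the RFC 1071
Internet checksum, which a software TCP has to compute over every byte
on the wire (absent checksum offload on the NIC):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BUFSIZE	(64 * 1024)
#define PASSES	10000

/* RFC 1071-style Internet checksum over a buffer. */
static unsigned short
in_cksum(const unsigned char *buf, size_t len)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += buf[len - 1] << 8;
	while (sum >> 16)			/* fold carries */
		sum = (sum & 0xffff) + (sum >> 16);
	return (unsigned short)~sum;
}

int
main(void)
{
	unsigned char *buf;
	volatile unsigned short ck;		/* defeat optimization */
	clock_t start, end;
	double secs;
	int i;

	if ((buf = malloc(BUFSIZE)) == NULL)
		return 1;
	for (i = 0; i < BUFSIZE; i++)
		buf[i] = i & 0xff;

	start = clock();
	for (i = 0; i < PASSES; i++)
		ck = in_cksum(buf, BUFSIZE);
	end = clock();

	secs = (double)(end - start) / CLOCKS_PER_SEC;
	printf("checksum rate: %.0f MB/s (cksum 0x%04x)\n",
	    (double)BUFSIZE * PASSES / secs / 1e6, ck);
	free(buf);
	return 0;
}

Whatever rate your machine prints, remember the real receive path also
copies the data and runs the rest of TCP and iSCSI; all of those
cycles come out of what the application wanted.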

Sure, you can eliminate the CPU usage with iSCSI by purchasing an
expensive iSCSI offload adapter, but then where are the cost savings?
They go "*poof*", as you observe.  Sure, you save on the FC switch, but
a high-end Gig-E switch that can support jumbo frames and traffic
shaping ain't exactly chopped liver...  Never mind that to get the
performance of one 2Gb/s FC port, you need *TWO* Gig-E ports.  And
forget about having transaction latency as low as FC's... iSCSI's
protocol overhead is just plain higher (Ethernet, IP, TCP, *and* the
iSCSI transport protocol).
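
Back-of-the-envelope arithmetic, in case anyone wants to check me.
The header sizes below are the standard ones; this is a sketch, not a
measurement:

#include <stdio.h>

/*
 * Payload rate of a Gig-E port carrying TCP.  Per-frame wire overhead
 * is preamble (8) + inter-frame gap (12) + MAC header (14) + FCS (4).
 * iSCSI's 48-byte BHS is per-PDU, amortized over many frames, so it
 * is ignored here.
 */
static double
payload_rate(double link_bps, int mtu)
{
	const int wire_overhead = 8 + 12 + 14 + 4;
	const int ip_tcp = 20 + 20;		/* IP + TCP headers */

	return (link_bps / 8.0) *
	    ((double)(mtu - ip_tcp) / (mtu + wire_overhead));
}

int
main(void)
{
	printf("Gig-E, 1500 MTU: ~%.0f MB/s\n",
	    payload_rate(1e9, 1500) / 1e6);
	printf("Gig-E, 9000 MTU: ~%.0f MB/s\n",
	    payload_rate(1e9, 9000) / 1e6);
	return 0;
}

Even with jumbo frames you top out around 124MB/s of payload per port,
so matching the ~200MB/s of a 2Gb/s FC port takes two of them, before
you've spent a single cycle actually running TCP.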

> If you want all the performance and throughput all the way through,
> you pay that price.  If you don't need it, but you still want some of
> the virtualisation flexibility and/or bulk capacity, options and
> offerings are limited.

I personally see iSCSI's niche as being "really cheap, moderately 
performing, expandable bulk storage".  Data archiving, etc.  "I need to 
put it away for 7 years, never touching it, until it's time to erase 
it."

> The RAID controllers are getting the smarts, and are learning to speak
> iSCSI for such purposes as cross-site replication.  Cluster
> filesystems on hosts, slowly, too.

Cluster file systems on hosts don't need to speak iSCSI.  To any 
clustered file system, iSCSI looks *exactly* like FC, except for the 
use of IQNs rather than WWNs for LU naming.  As for cross-site 
replication, I don't see iSCSI as being particularly valuable for
that.  From my perspective, this is best done at the FILE layer, *not*
the block layer.
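
(If the naming difference isn't familiar, it looks like this; both
names below are made up, but the shapes are right:)

	iSCSI IQN:  iqn.2005-02.org.example:storage.target0
	FC WWN:     50:06:0e:80:00:c3:2d:11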

> Perhaps someone figures they can
> get even better cost performance on the local backend interconnect
> using ata-over-ethernet (with gig-e jumbo frames at least, I hope), by
> doing away with many aspects of iSCSI that don't really apply behind
> the raid controller.
>
> It might also make sense for low-cost compute clusters and blade
> server type environments, I guess.

Anything-over-Ethernet has to account for the fact that frames can be 
delivered out-of-order; Ethernet provides no ordering guarantees.  
Eventually, you have to solve basically all the same problems that
iSCSI had to solve (iSCSI PDUs can arrive out of order when a session
is spread across multiple TCP connections).  SATA doesn't have this
problem; its link layer provides in-order frame delivery, if I recall
correctly.
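
For the curious, here is roughly the kind of machinery any
Anything-over-Ethernet protocol ends up growing.  The frame format,
sequence-number field, and window size are all made up for
illustration; this is not from any real AoE implementation:

#include <stdio.h>
#include <string.h>

#define WINDOW	64			/* reorder window, in frames */

struct frame {
	unsigned int	seq;		/* sender's sequence number */
	char		data[128];	/* payload, truncated for sketch */
};

static struct frame	pending[WINDOW];	/* held for reordering */
static int		present[WINDOW];	/* slot occupancy flags */
static unsigned int	next_seq;		/* next deliverable seq */

static void
deliver(const struct frame *f)
{
	printf("delivered frame %u\n", f->seq);
}

/*
 * Accept a frame in whatever order the wire hands it to us; deliver
 * it (and anything it unblocks) only when it is next in sequence.
 */
static void
frame_input(const struct frame *f)
{
	if (f->seq - next_seq >= WINDOW)
		return;			/* stale or too far ahead; drop */

	pending[f->seq % WINDOW] = *f;
	present[f->seq % WINDOW] = 1;

	while (present[next_seq % WINDOW]) {
		deliver(&pending[next_seq % WINDOW]);
		present[next_seq % WINDOW] = 0;
		next_seq++;
	}
}

int
main(void)
{
	struct frame f;

	memset(&f, 0, sizeof(f));
	f.seq = 1; frame_input(&f);	/* arrives early; held */
	f.seq = 2; frame_input(&f);	/* also held */
	f.seq = 0; frame_input(&f);	/* unblocks 0, 1, 2 in order */
	return 0;
}

And that's just ordering; add retransmission, flow control, and
congestion handling, and you've rebuilt a fair chunk of TCP.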

-- thorpej