Subject: Re: Anyone working on ATA over Ethernet?
To: Daniel Carosone <dan@geek.com.au>
From: Jason Thorpe <thorpej@shagadelic.org>
List: tech-kern
Date: 02/14/2005 19:16:45
On Feb 14, 2005, at 5:18 PM, Daniel Carosone wrote:
> On their sales quotes for the other
> side, iSCSI replaces somewhat expensive FC-AL HBAs with even more
> expensive HBAs that offload iSCSI and TCP processing to the card.
Perhaps we're getting off on a tangent here, but...
As I see it, one of the main flaws of iSCSI is the "software-ness" of
it. It's not a flaw in the protocol, but it's a flaw when you consider
the expectations people have of their storage system. They want
110MB/s for each Gig-E port, but that comes with a CPU usage cost, a
cost that you do not pay with FC because FC offloads all of that
transport and link processing. Sure, you get the performance you
expect out of your storage system, but you don't get the performance
you expect out of the applications running on the hosts that talk to
it, because those hosts don't have much CPU left.
Sure, you can eliminate the CPU usage with iSCSI by purchasing an
expensive iSCSI offload adapter, but now where is the cost savings?
It goes "*poof*", as you observe. Sure, you save on the FC switch, but
a high-end Gig-E switch that can support jumbo frames and traffic
shaping ain't exactly chopped liver... Never mind that to get the
performance of one FC port, you need *TWO* Gig-E ports. And forget
about having a transaction latency as low as FC... iSCSI's protocol
overhead is just plain higher (Ethernet, IP, TCP, *and* the iSCSI
transport protocol).
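To put rough numbers on that overhead, here's a back-of-the-envelope
sketch in C. The header sizes are the standard ones; charging one
iSCSI basic header segment per frame is the pessimistic case, since a
single PDU's data segment can span many TCP segments, but it's honest
enough for small I/Os:

#include <stdio.h>

/*
 * Rough efficiency numbers for iSCSI over Gig-E.  Ethernet framing
 * overhead is preamble/SFD (8) + header (14) + FCS (4) + inter-frame
 * gap (12); then IPv4 and TCP headers, then the 48-byte iSCSI basic
 * header segment (pessimistically, one per frame).
 */
#define ETHER_OVERHEAD	(8 + 14 + 4 + 12)
#define IP_HDR		20
#define TCP_HDR		20
#define ISCSI_BHS	48

static void
efficiency(int mtu)
{
	int payload = mtu - IP_HDR - TCP_HDR - ISCSI_BHS;
	int wire = mtu + ETHER_OVERHEAD;
	double eff = (double)payload / wire;

	printf("MTU %4d: %4d payload / %4d wire bytes = %4.1f%% "
	    "(~%5.1f MB/s of Gig-E)\n",
	    mtu, payload, wire, eff * 100.0, eff * 125.0);
}

int
main(void)
{

	efficiency(1500);	/* standard frames */
	efficiency(9000);	/* jumbo frames */
	return 0;
}

That comes out to roughly 115 MB/s of payload on a saturated Gig-E
link at a 1500-byte MTU, before counting ACKs, retransmissions, or R2T
turnarounds, which is about where the 110MB/s expectation comes from;
a 2Gb/s FC port moves nearly twice that, hence needing two Gig-E ports
per FC port.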
> If you want all the performance and throughput all the way through,
> you pay that price. If you don't need it, but you still want some of
> the virtualisation flexibility and/or bulk capacity, options and
> offerings are limited.
I personally see iSCSI's niche as being "really cheap, moderately
performing, expandable bulk storage". Data archiving, etc. "I need to
put it away for 7 years, never touching it, until it's time to erase
it."
> The RAID controllers are getting the smarts, and are learning to speak
> iSCSI for such purposes as cross-site replication. Cluster
> filesystems on hosts, slowly, too.
Cluster file systems on hosts don't need to speak iSCSI. To any
clustered file system, iSCSI looks *exactly* like FC, except for the
use of IQNs rather than WWNs for LU naming. As for cross-site
replication, I don't see iSCSI as being particularly valuable for that.
From my perspective, this is best done at the FILE layer, *not* the
block layer.
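For concreteness, the two naming schemes look like this (both
identifiers are made up for illustration):

	iSCSI:	iqn.2005-02.org.example:storage.disk1	(IQN)
	FC:	50:06:0e:80:05:27:3e:21			(WWN)

Everything above the naming layer sees the same SCSI logical units
either way.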
> Perhaps someone figures they can
> get even better cost performance on the local backend interconnect
> using ata-over-ethernet (with gig-e jumbo frames at least, I hope), by
> doing away with many aspects of iSCSI that don't really apply behind
> the raid controller.
>
> It might also make sense for low-cost compute clusters and blade
> server type environments, I guess.
Anything-over-Ethernet has to account for the fact that frames can be
delivered out-of-order; Ethernet provides no ordering guarantees.
Eventually, you have to solve basically all the same problems that
iSCSI had to solve (PDUs can be delivered out-of-order if you have
multiple TCP connections in an iSCSI session). SATA doesn't have
this problem; the link layer provides in-order frame delivery, if I
recall correctly.
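Just to make the shape of the problem concrete, here's a toy
resequencing sketch; the frame format, window size, and names are all
invented for illustration (this is the sort of DataSN/StatSN-style
windowing iSCSI does within a session, not code from any real
AoE implementation):

#include <stdio.h>

#define WINDOW	8	/* hypothetical resequencing window, in frames */

struct frame {
	unsigned int	seq;		/* sender-assigned sequence number */
	char		data[64];	/* payload (toy-sized) */
};

static struct frame	pending[WINDOW];
static int		valid[WINDOW];
static unsigned int	next_seq;	/* next sequence number to deliver */

static void
deliver(const struct frame *f)
{

	printf("delivered seq %u: %s\n", f->seq, f->data);
}

/*
 * Accept a frame off the wire in whatever order it arrived, and
 * hand frames upward strictly in sequence.  Wraparound, timeouts,
 * and retransmission are all ignored here.
 */
static void
frame_input(const struct frame *f)
{

	if (f->seq < next_seq)
		return;		/* duplicate; drop it */
	if (f->seq - next_seq >= WINDOW)
		return;		/* beyond the window; force a retransmit */

	pending[f->seq % WINDOW] = *f;
	valid[f->seq % WINDOW] = 1;

	/* Flush everything that is now contiguous. */
	while (valid[next_seq % WINDOW]) {
		valid[next_seq % WINDOW] = 0;
		deliver(&pending[next_seq % WINDOW]);
		next_seq++;
	}
}

int
main(void)
{
	struct frame f1 = { 1, "arrives first" };
	struct frame f0 = { 0, "arrives second" };

	frame_input(&f1);	/* out of order; parked in the window */
	frame_input(&f0);	/* fills the gap; both flush in order */
	return 0;
}

A real protocol also needs sequence wraparound, timers, and
retransmission requests on top of this; the point is just that none of
it comes for free once you step outside TCP (or SATA's link layer).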
-- thorpej