A comparison of AoE to FC and iSCSI protocols

One of the first issues I have to contend with when talking about Coraid storage, and its use of the ATA-over-Ethernet (AoE) protocol to transfer data, is the response “Ethernet? Oh, so it’s iSCSI then?”

No, it isn’t.

AoE was built from the ground up as an open source data transfer protocol, specifically concerned with finding the most efficient way to transmit raw disk I/O commands over raw Ethernet, and keeping the overhead as low as possible to maximize the throughput.  

In many ways AoE is more akin to Fibre Channel (FC) than it is to iSCSI, in that it is a non-routable protocol designed for locally based storage rather than for sending data over the Internet. Like FC, AoE can be made to route over the Internet when it needs to, such as in site-to-site DR applications, but the non-routable nature of the protocol makes accidental exposure of data to non-authorized networks that much harder.

So in order to help differentiate the data transfer protocols upon which all your networked storage systems are based, this blog entry is here to help dispel some of the myths about AoE. 

The only real similarity between AoE and iSCSI is that they both use Ethernet as the transport medium. iSCSI runs over TCP/IP (Layer 4), whereas AoE runs directly on Ethernet at Layer 2, and after that things get very different.

Data delivery the iSCSI way 

The diagram below shows how data is sent from a client to a disk device using the iSCSI protocol.

[Diagram: iSCSI data delivery (Image002)]

iSCSI is a connection-based topology, as is FC, and therefore requires sequenced serial delivery of the data packets over the network. Each 64K I/O transfer is wrapped in an iSCSI header with a CRC appended and broken up into segments for transmission. These segments are themselves inserted in a TCP/IP wrapper with another CRC, and the resultant Ethernet frames are then sent sequentially down a single connection.
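As a rough illustration of the wrapping described above, the sketch below counts the Ethernet frames and bytes on the wire for one 64K I/O. The header sizes are the usual minimum values (no VLAN tag, no IP or TCP options, no iSCSI digests), and sending the whole I/O as a single PDU at a 1500-byte MTU is an assumption for illustration; `iscsi_wire_cost` is a hypothetical helper, not part of any initiator.

```python
import math

# Header sizes in bytes. These are assumed minimum values for illustration.
ETH_HEADER = 14   # destination MAC, source MAC, EtherType
ETH_FCS = 4       # Ethernet frame check sequence (CRC)
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options
ISCSI_BHS = 48    # iSCSI Basic Header Segment


def iscsi_wire_cost(io_bytes, mtu=1500):
    """Return (frames, bytes_on_wire) for one I/O sent as a single iSCSI PDU."""
    mss = mtu - IP_HEADER - TCP_HEADER        # TCP payload per frame
    pdu = ISCSI_BHS + io_bytes                # iSCSI header plus data
    frames = math.ceil(pdu / mss)             # TCP segments needed
    per_frame = ETH_HEADER + IP_HEADER + TCP_HEADER + ETH_FCS
    return frames, pdu + frames * per_frame


frames, wire = iscsi_wire_cost(64 * 1024)
print(frames, wire)  # 45 68194
```

Passing `mtu=9000` shows the same I/O with jumbo frames enabled, which cuts the frame count but leaves the TCP sequencing and reassembly in place.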

It must be stressed here that no matter how many iSCSI ports you have on your servers and storage, only one of these carries data at any one time, due to the connection requirement. The other ports can be brought into play in a round robin fashion to increase throughput, but this basic single channel transfer method is key to how iSCSI works: iSCSI cannot distribute a single I/O between multiple ports.

This can be seen clearly in a VMware environment, where looking at the paths to the storage in the hypervisor client shows 1 active path and all others in standby. On failure of the active iSCSI path (and a look in the VMkernel log shows you do get quite a few!), one of the standby paths is then made active.

So iSCSI does a lot of work to get data from initiator to target, and once at the target everything needs to be reassembled in the correct order and checked for integrity. After this the data is still streamed to disk via 512-byte disk sectors (most likely, for a 64K I/O). Disk I/O is held off until the entire I/O is reassembled, and TCP imposes significant latency if data is dropped in the network.

FC faces the same connectivity problems and has a difficult time using multiple network paths, although this is made up for by the transmission medium and very expensive switches and controllers. AoE just requires commodity layer 2 switches with the ability to provide jumbo frame support and a decent throughput capability.

Data delivery the AoE way

The diagram below shows how data is sent from a client to a disk device using the AoE protocol.

[Diagram: AoE data delivery (Image001)]

As you can see, this is significantly different from the iSCSI diagram above. Here the 64K I/O is simply split into 8K blocks, and each of these blocks is placed into an Ethernet frame with an AoE header and a CRC appended. This is why it is important for jumbo frames to be enabled on AoE networks, as this allows each 8K block to occupy a single Ethernet frame for maximum efficiency.
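The arithmetic behind the 8K split can be sketched as follows. The 10-byte AoE header and 12-byte ATA command header are taken from my reading of the published AoE specification and should be treated as assumptions here, as should the 9000-byte jumbo MTU; `aoe_wire_cost` is a hypothetical helper for illustration only.

```python
# Header sizes in bytes. Assumed values for illustration.
ETH_HEADER = 14   # destination MAC, source MAC, EtherType
ETH_FCS = 4       # Ethernet frame check sequence (CRC)
AOE_HEADER = 10   # AoE common header (version/flags, error, shelf, slot, command, tag)
ATA_HEADER = 12   # ATA command header carried inside the AoE frame
BLOCK = 8 * 1024  # data carried per frame, as described in the text


def aoe_wire_cost(io_bytes, block=BLOCK, mtu=9000):
    """Return (frames, bytes_on_wire) for one I/O split into block-sized frames."""
    if AOE_HEADER + ATA_HEADER + block > mtu:
        # Without jumbo frames an 8K block cannot fit in a single frame.
        raise ValueError("block does not fit in one frame at this MTU")
    frames = -(-io_bytes // block)            # ceiling division
    per_frame = ETH_HEADER + AOE_HEADER + ATA_HEADER + ETH_FCS
    return frames, io_bytes + frames * per_frame


frames, wire = aoe_wire_cost(64 * 1024)
print(frames, wire)  # 8 65856
```

Calling `aoe_wire_cost(64 * 1024, mtu=1500)` raises `ValueError`, which is the code's way of saying that the 8K-per-frame layout depends on jumbo frame support in the switches.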

AoE then sends these disk I/O datagrams in parallel over the network, utilizing all available ports automatically. AoE does not require sessions or sequence numbers, and each AoE frame is an idempotent 8K disk I/O. Frames can arrive out of order, or not at all: if a frame (or returning Ack) is lost, the initiator will resend within microseconds (vs. the standard 200 ms TCP timeout).
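To show why idempotent block writes make ordering and loss harmless, here is a toy simulation. It is not the AoE driver's logic: `send_io`, the loss rate and the random shuffling are all hypothetical, and timing is not modelled at all. The point is only that resending a block write that already landed leaves the disk in the same state.

```python
import random

BLOCK = 8 * 1024  # bytes per frame, as described in the text


def send_io(data, loss_rate=0.3, seed=1):
    """Deliver `data` as independent block writes over a lossy, unordered link."""
    rng = random.Random(seed)
    disk = bytearray(len(data))                 # stands in for the target disk
    pending = set(range(0, len(data), BLOCK))   # offsets not yet acknowledged
    transmissions = 0
    while pending:
        batch = list(pending)
        rng.shuffle(batch)                      # frames may arrive in any order
        for offset in batch:
            transmissions += 1
            if rng.random() < loss_rate:        # request frame lost in transit
                continue
            # Idempotent write: applying the same block twice changes nothing.
            disk[offset:offset + BLOCK] = data[offset:offset + BLOCK]
            if rng.random() < loss_rate:        # Ack lost, so the block is resent
                continue
            pending.discard(offset)
    return bytes(disk), transmissions


data = bytes(range(256)) * 256                  # one 64K I/O
written, sent = send_io(data)
print(written == data, sent >= 8)               # True True
```

No sequence numbers or reassembly buffer are needed: each frame names its own disk offset, so the target can write it the moment it arrives.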

This makes AoE much more efficient than iSCSI (parallel vs. serial I/O), and that doesn’t include TCP overhead. In practice, throughput for AoE is 2x to 4x that of equivalent iSCSI (4 initiator iSCSI connections compared to 4 AoE ports), allowing peak throughput approaching 2 GByte/s for a single Coraid shelf of 15K disks with 2x 10Gbit AoE CX4 ports. This efficiency, along with the low latency design of the Coraid appliances, also allows the maximum amount of disk IOPS to be delivered from the storage network.
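To put the 2 GByte/s figure in context, the quick calculation below compares it with the raw line rate of two 10Gbit ports. It only converts units; `line_rate_gbytes` is a hypothetical helper and framing overhead is ignored.

```python
def line_rate_gbytes(ports, gbit_per_port):
    """Raw aggregate line rate in GByte/s, ignoring all framing overhead."""
    return ports * gbit_per_port / 8


raw = line_rate_gbytes(2, 10)   # two 10Gbit ports
utilisation = 2.0 / raw         # quoted peak throughput vs. raw line rate
print(raw, round(utilisation, 2))  # 2.5 0.8
```

In other words, the quoted peak corresponds to roughly 80% of what the two links could carry in raw terms.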

It is very important that the parallel nature of transmission is highlighted as this means that simply adding more ports to an AoE network can dramatically increase throughput; and because the use of all ports is automatic, with no MPIO configuration required AT ALL, this increase in throughput becomes a simple plug and play operation that can be performed in minutes. 

In a previous blog entry we highlighted some of the many misconceptions that AoE suffers from, being an unknown protocol in a world dominated by FC and iSCSI vendors. The answers there should be read alongside the information given in this blog entry, a summary of which appears below:

Summary of differences discussed between FC/iSCSI and AoE protocols

 

                  | iSCSI/FC                                                  | AoE
Topology          | Connection                                                | Connectionless
Transport         | IO Session/Sequences                                      | Block Datagrams
IO Transfer       | Serial delivery; IO TCP reassembly                        | Parallel delivery; Datagram Disk IO
Multi-Path        | Multipath Software; Manual Setup                          | Automatic; Plug and Play
Dropped Data      | iSCSI: TCP with Retransmit                                | Datagram re-transmit
                  | FC: Link Flow Control, IO timeout                         |
Out of Order Data | iSCSI: TCP block delivery                                 | In order arrival not mandated
                  | FC: Prevented via Link Level Flow Control                 |
Overhead          | Fibre Channel is up to 2-4x faster performance than iSCSI | AoE is 30% more efficient than Fibre Channel

Source: Coraid Inc

Conclusion

When approaching Coraid opportunities I do try to impress that AoE is an alternative to Fibre Channel, and not another flavour of iSCSI. It is in the FC market that I see the most benefit to be had, especially when comparing to Fibre Channel over Ethernet (FCoE), which is at best an interim fix to allow organisations with heavy investment in FC hardware to migrate to Ethernet without wasting that investment. Once a full hardware refresh becomes viable in an organisation, I think full Ethernet solutions will really see their day, and AoE is far better placed than iSCSI to take over where FC leaves off.

In conclusion, I hope it can now be seen in what ways AoE is definitely NOT iSCSI, and why, being based on Ethernet, it is a much better choice for connecting your storage networks efficiently and relatively cheaply.

For more details or a discussion on what AoE can do for your Enterprise Storage drop us a line at info@millennia.it or visit our website via the link in the blog profile opposite.

 


4 Responses to A comparison of AoE to FC and iSCSI protocols

  1. garegin says:

    as far as i understand iscsi is not limited to ethernet. it is layer 2 agnostic and can run over the internet or wifi if need be.

    • millennia97 says:

      Look at the age of the article, it is biased, poorly researched and generally rubbish. Coraid are still around and have a large presence in Cloud infrastructure so clearly are in very large datacentres!
      As far as retransmission goes it uses datagram retransmission so you are talking microseconds instead of the millisecond TCP retransmission, and splits everything into parallel streams so that all available paths are used simultaneously.
      Security can be an issue and the network should be physically segregated and port locked. You can use MAC filtering and for remote sites routing isn’t an issue as Coraid have a tunneling device that bridges networks.
      Fact is, saying it can’t work is ridiculous because it DOES work, so this is just one of those articles you consign to the bin, as observation trumps bad theory.

      In any case I debunked these statements in another blog post here: http://blog.millennia.it/2010/09/28/clearing-up-some-misconceptions-about-the-aoe/ 😉

  2. Badiane says:

    I just realized that it was addressed.
