Fibre Channel vs. iSCSI – The Great Debate Generates Questions Galore

The SNIA Ethernet Storage Forum recently hosted the first of our “Great Debates” webcasts on Fibre Channel vs. iSCSI. The goal of this series is not to have a winner emerge, but rather to provide vendor-neutral education on the capabilities and use cases of these technologies so that attendees can become more informed and make educated decisions. And it worked! Over 1,200 people have viewed the webcast in the first three weeks! And the comments from attendees were exactly what we had hoped for:

“A good and frank discussion about the two technologies that don’t always need to compete!”

“Really nice and fair comparison guys. Always well moderated, you hit a lot of material in an hour. Thanks for your work!”

“Very fair and balanced overview of the two protocols.”

“Excellent coverage of the topic. I will have to watch it again.”

If you missed the webcast, you can watch it on-demand at your convenience and download a copy of the slides.

The debate generated many good questions and our expert speakers have answered them all:

Q. What is RDMA?

A. RDMA is an acronym for Remote Direct Memory Access. It is a part of a protocol through which memory addresses are exchanged between end points so that data is able to move directly from the memory in one end point over the network to the memory in the other end point, without involving the end point CPUs in the data transfer. Without RDMA, intermediate copies (sometimes, multiple copies) of the data are made on the source end point and the destination end point.

RoCEv1, RoCEv2, iWARP, and InfiniBand are all protocols that are capable of performing RDMA transfers. iSER is iSCSI over RDMA and often uses iWARP or RoCE. SRP is a SCSI RDMA protocol that runs only over InfiniBand. FC uses hardware-based DMA to perform transfers without the need to make intermediate copies of the data; therefore RDMA is not needed for FC and does not apply to FC.

Q. Can multi-pathing be used for load balancing or high availability?

A.  Multi-pathing is used both for load balancing and for high availability. In an active-passive setup it is used only for high-availability, while in an active-active setup it is used for both.
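The difference between the two setups can be sketched in a few lines of Python. This is only an illustration: the class name `MultipathSelector`, the path labels, and the simple round-robin policy are assumptions made for this example, not the behavior of any particular multipath driver.

```python
from itertools import count


class MultipathSelector:
    """Toy path selector; the names and policy are illustrative assumptions."""

    def __init__(self, paths, mode="active-active"):
        if mode not in ("active-active", "active-passive"):
            raise ValueError(f"unknown mode: {mode}")
        self.paths = list(paths)
        self.mode = mode
        self.failed = set()
        self._counter = count()

    def fail(self, path):
        """Mark a path as down so it is no longer selected."""
        self.failed.add(path)

    def healthy_paths(self):
        return [p for p in self.paths if p not in self.failed]

    def next_path(self):
        """Pick the path for the next I/O."""
        healthy = self.healthy_paths()
        if not healthy:
            raise RuntimeError("all paths are down")
        if self.mode == "active-passive":
            # Only the first healthy path carries traffic (availability only).
            return healthy[0]
        # Active-active: rotate across healthy paths (load balancing + availability).
        return healthy[next(self._counter) % len(healthy)]
```

In active-active mode, successive calls to `next_path()` rotate across all healthy paths; in active-passive mode, the second path is used only after `fail()` has been called on the first.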

Q. Some companies are structured so that iSCSI is handled by network services and the storage team supports FC, so there is storage and network overlap. Network people should be aware of storage and reverse.

A. Correct, one of the big tradeoffs between iSCSI and FC may end up not being a technology tradeoff at all. In some environments, the political and organizational structure plays as much a part in the technology decision as the technology itself. Strong TCP/IP network departments may demand that they manage everything, or they may demand that storage traffic be kept as far from their network as possible. Strong storage network departments may demand their own private networks (either TCP/IP for iSCSI, or FC).

In the end, the politics may play as important a role in the decision of iSCSI vs. FC as the actual technology itself.

Q. If you have an established storage network (i.e. FC/iSCSI) is there a compelling reason you would switch?

A. Typically, installations grow by adding to their existing configuration (iSCSI installations typically add more iSCSI, and FC installations add more FC). Switching from one technology to another may occur for various reasons (for example, the requirements of the organization have changed such that the other technology better meets the organizational needs or a company merger dictates a change).

Fibre Channel is at 32/128Gb now. iSCSI is already in products at 100Gb, with 200/400Gb next, and so on. In short, Ethernet currently has a shorter speed upgrade cycle than FC. This is especially important now that SSDs have arrived on the scene. With the performance available from SSDs, the SAN is now the potential choke point. With the arrival of Persistent Memory, this problem can be exacerbated yet again, and the choice of network architecture will be important.

One of the reasons why people might switch has very little to do with the technology, and more to do with ancillary reasons. For instance, iSCSI is something of an “outside-in” management paradigm, while Fibre Channel has more of an “inside-out” paradigm. That is, management is centralized in FC, where iSCSI has many more touch-points [link: http://brasstacksblog.typepad.com/brass-tacks/2012/02/fc-and-fcoe-versus-iscsi-network-centric-versus-end-node-centric-provisioning.html]. When it comes to consistency at scale, there are major differences in how each storage network handles management as well as performance. Likewise, if programmability and network ubiquity are more important, then Ethernet-based iSCSI is an appealing technology to consider.

Q. Are certain storage vendors recommending FC over iSCSI for performance reasons because of how their array software works?

A. Performance is not the only criterion, and vendors should be careful to assess their customers’ needs before recommending one solution over another. If you feel that a vendor is proposing X because, well, X is all they have, then push back and insist on getting some facts that support their recommendation.

Q. Which is better for a backup solution?

A. Both FC and iSCSI can be used to back up data. If a backup array is emulating a tape library, this is usually easier to do with FC than with iSCSI. Keep in mind that many backup solutions will run their own protocol over Ethernet, without using either iSCSI or FC.

Q. I disagree that Ethernet is cheaper. If you look at cost of the 10/25Gb SFP+/SFP28 transceivers required vs. 16/32Gb transceiver costs, the FC solution is on par or in some cases, cheaper than Ethernet solutions. If you limit Ethernet to 10GBASE-T, then yes, it is cheaper.

A.  This is part of comparing apples to apples (and not to pineapples). iSCSI solutions typically are available in a wider range of price choices from 1Gb to 100Gb speeds (there are more lower cost solutions available with iSCSI than with FC). But, when you compare environments with comparable features, typically, the costs of each solution are similar. Note that 10/25Gb Ethernet supports DAC (direct-attach copper) cables over short distances—such as within a rack or between adjacent racks—which do not require separate transceivers.

Q. Do you know of a vendor that offers storage arrays with port speed higher than 10Gbs? How is 50Gbs and 100Gbs Ethernet relevant if it’s not available from storage vendors?

A. It’s available now for top-of-rack switches and from flash storage startups, as well as a few large storage OEMs supporting 40GbE connections. Additional storage systems will adopt it when it becomes necessary to support greater than 1GB (that’s a gigabyte!) per second of data movement from a single port, and most storage systems already offer multiples of 10Gbps ports on a single system.
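The “greater than 1GB per second” threshold follows from dividing the link speed in bits by eight. A minimal sketch of that arithmetic is shown here, assuming raw line rate only (encoding and protocol overhead, which lower the usable figure, are ignored); the function name is made up for this example.

```python
def raw_gigabytes_per_second(link_gigabits_per_second):
    """Convert a link speed in Gb/s to GB/s at raw line rate (8 bits per byte)."""
    return link_gigabits_per_second / 8


# Print the raw per-port ceiling for some common Ethernet speeds.
for speed in (10, 25, 40, 50, 100):
    print(f"{speed:>3} Gb/s port -> {raw_gigabytes_per_second(speed):.2f} GB/s raw")
```

By this measure a single 10Gb port tops out at roughly 1.25 GB/s before overhead, which is why faster ports only become necessary once one port has to move more than about a gigabyte per second.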

100GbE iSCSI is in qualification now and we expect there will be offerings from tier-1 storage OEMs later this year. Similarly, higher-speed Fibre Channel port speeds are in the works. However, it’s important to note that at the port level, speed is not the only consideration: port configuration becomes increasingly important (e.g., it is possible to aggregate Fibre Channel ports up to 16x the speed of each individual port; Ethernet aggregation is possible too, but it works differently).

Q. Why are there so few vendors in the FC space?

A. Historically, FC started with many vendors. Over the life of FC development, a fair number of mergers and acquisitions have reduced the number of vendors in this space. Today, there are 2 primary switch vendors and 2 primary adapter vendors.

Q. You talk about reliable, but how about stable and predictable?

A. Both FC and iSCSI networks can be very stable and predictable. Because FC-SAN is deployed only for storage and has fewer vendors with well-known configurations, it may be easier to achieve the highest levels of stability and predictability with less effort when using FC-SAN. iSCSI/Ethernet networks have more setup options and more diagnostic and reporting tools available, so it may be easier to monitor and manage iSCSI networks at large scale once they are configured.

Q. On performance, how do FC and iSCSI compare on speed as it relates to IOPS?

A.  IOPS is largely a function of latency and secondarily related to hardware offloads and bandwidth. For this reason, FC and iSCSI connections typically offer similar IOPS performance if they run at similar speeds and with similar amounts of hardware offload from the adapter or HBA. Benchmark reports showing very high IOPS performance for both iSCSI and Fibre Channel are available from 3rd party analysts.
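The relationship between latency and IOPS can be sketched with Little's Law (throughput equals the number of outstanding operations divided by the time each one takes), capped by what the link can carry. The function names and the sample numbers below are assumptions chosen for illustration, not measurements of any FC or iSCSI product.

```python
def latency_bound_iops(outstanding_ios, latency_microseconds):
    """Little's Law: throughput = concurrency / time per operation."""
    return outstanding_ios * 1_000_000 / latency_microseconds


def bandwidth_bound_iops(link_gigabits_per_second, io_size_bytes):
    """Upper limit set by raw link bandwidth for a given I/O size."""
    bytes_per_second = link_gigabits_per_second * 1_000_000_000 / 8
    return bytes_per_second / io_size_bytes


def estimated_iops(outstanding_ios, latency_microseconds,
                   link_gigabits_per_second, io_size_bytes):
    """The achievable rate is whichever of the two limits is lower."""
    return min(
        latency_bound_iops(outstanding_ios, latency_microseconds),
        bandwidth_bound_iops(link_gigabits_per_second, io_size_bytes),
    )
```

For example, `estimated_iops(32, 500, 10, 4096)` is limited by latency rather than by the link, which illustrates why two transports with similar latency and similar offload tend to land at similar IOPS.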

Q. Are there fewer FC ports due to the high usage of blade chassis that share access or due to more iSCSI usage?

A.  It is correct that most blade servers use Ethernet (FCoE, iSCSI, or NFS), but this is a case of comparing apples and pineapples. FC ports are used for storage in a data center. Ethernet ports can be used for storage in a data center, but they are also used in laptops and desktops for e-mail, web browsing; wireless control of IoT (Internet of Things – e.g., light bulbs, thermostats, etc.); cars (yes, modern automobiles have their own Ethernet network); and many other things. So, if you compare the number of data center storage ports to the number of every other port used for every other type of network traffic, yes, there will be a smaller number associated with only the data center storage ports.

Q. Regarding iSCSI offload cards, we used to believe that software initiators were faster because they could leverage the fast chips in the server. Have iSCSI offload cards changed significantly in recent years?

A. This has traditionally been a function of the iSCSI initiator offload architecture. A full/cmd offload solution tends to be slower since it executes the iSCSI stack in firmware on a slow processor in the NIC. A modern PDU-based solution (such as that supported by Open-iSCSI on Linux) offloads only performance-critical operations to the silicon and offers latency as low as the software initiator, and perhaps lower.

Q. I think one of the more important differences between FC and iSCSI is that a pure FC network is not routable whereas iSCSI is, because of the nature of the protocol stack each one relies on. Maybe in that sense iSCSI has an advantage, especially when we think in hybrid cloud scenarios increasingly more common today. Am I right?

A.  Routability is usually discussed in the context of the TCP/IP network layering model i.e. how traffic moves through different Ethernet switches/routers and IP domains to get from the source to the destination.

iSCSI is built on top of TCP/IP, and, hence, iSCSI benefits from interoperating with existing Ethernet switching/routing infrastructure, and not requiring special gateways when leaving the data center, for example, in the hybrid cloud case.
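Because iSCSI rides on ordinary TCP, basic reachability of a target portal can be checked like any other TCP service. The sketch below assumes the IANA-registered iSCSI port 3260; the function name is hypothetical, and a successful connection shows only that the TCP path is open, not that an iSCSI login would succeed.

```python
import socket

ISCSI_DEFAULT_PORT = 3260  # IANA-registered TCP port for iSCSI targets


def tcp_reachable(host, port=ISCSI_DEFAULT_PORT, timeout_seconds=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False
```

Calling `tcp_reachable("192.0.2.10")` (a documentation address standing in for a real portal) would exercise the same routers and firewalls that iSCSI traffic itself would cross.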

The industry has also developed a standard for carrying Fibre Channel over IP: FCIP. FCIP is routable, and it is part of the FC-BB-5 standard, which also includes FCoE.

Q. This is all good info, but this is all contingent on the back-end storage, inclusive of the storage array/SAN/NAS/server and disks/SSD/NVMe, actually being able to take advantage of the bandwidth. SAN vendors have been very slow to adopt these larger pipes.

A. New technologies have adoption curves, and to be frank, adoption up the network speed curve has been slow since 10Gbps. A lot of that is due to disk technologies; they haven’t gotten that much faster in the last decade (bigger, yes, but not faster), and it’s difficult to drive a big expensive pipe efficiently with slow drives. Now with SSD and NVMe (and other persistent memory technologies to come), device latency and bandwidth have become a big issue. That will drive the adoption not only of fatter pipes for bandwidth, but also of RDMA technologies to reduce latency.

Q. What is a good source of performance metrics for data on CPU requirements for pushing/pulling data? This is in reference to the topic of “How can a server support 100 Gb/s?”

A. Once 100Gb iSCSI is offloaded via special adapter cards, there should be no more load imposed on the server than any other 100Gb link would require. Websites of independent testing companies (e.g. Demartek) should provide specific information in this regard.

Q. What about iSCSI TLV?

A. This is a construct for placing iSCSI traffic on specific classes of service in a DCBX switch environment, which in turn is used when using a no-drop environment for iSCSI traffic; i.e., it’s used for “lossless iSCSI.” iSCSI TLV is a configuration setting, not a performance setting. All it does is allow an Ethernet switch to tell an adapter which Class of Service (COS) setting it’s using.

However, this is actually not necessary for iSCSI, and in some cases [see e.g. https://blogs.cisco.com/datacenter/the-napkin-dialogues-lossless-iscsi] may actually be undesirable. iSCSI is built on TCP and it inherits the reliability features from the underlying TCP layer and does not need a DCBX infrastructure. In the case of hardware offloaded iSCSI, if a loss is observed in the system, the TCP retransmissions happen at silicon speeds without perturbing the host software and the resulting performance impact to the iSCSI traffic is insignificant. Further, Ethernet speeds have been rising rapidly, and have been overcoming any need for any type of traffic pacing.

Q. How far away is standard-based NVMe over 100G Ethernet? Surely once 100GE starts to support block storage applications, is 128G FC now unattractive?

A.  NVMe over Fabrics (NVMe™-oF) is a protocol that is independent of the underlying transport network. That is, the protocol can accept any speed of the transport underneath. The key thing, then, is when you will find Operating System support for running the protocol with faster transport speeds.

For instance, NVMe-oF over 10/25/40/50/100G Ethernet is available with RHEL7.4 and RHEL7.5. NVMe-oF over high-speed Fibre Channel will be dependent upon the adapter manufacturers’ schedule, as the qualification process is a bit more thorough. It may be challenging for FC to keep up with the Ethernet ecosystem, either in price, or with the speed of introducing new speed bumps, due to the much larger Ethernet ecosystem, but the end-to-end qualification process and ability to run multi-protocol deterministic storage with Fibre Channel networks often surpass raw speeds for practical use.

Q. Please comment on the differences/similarities from the perspectives of troubleshooting issues.

A. Both Fibre Channel and iSCSI use similar troubleshooting techniques. Tools such as traceroute, ping, and others (the names may be different, but the functionality is the same) are common across both network types.

Fibre Channel’s troubleshooting tools are available at both the adapter level and the switch level, but since Fibre Channel has the concept of a fabric, many of the tools are system-wide. This allows for many common steps to be taken in one centralized management location.

Troubleshooting the TCP/IP layer of iSCSI is no different from troubleshooting the rest of the TCP/IP traffic that IT staff are used to, and standard debugging tools work. Troubleshooting the iSCSI layer is very similar to FC, since both essentially appear as SCSI and offer essentially the same services.

Q. Are TOE cards required today?

A. TOE cards are not required today. TCP Offload Engines (TOEs) have both advantages and disadvantages. TOEs are more expensive than ordinary non-TOE Network Interface Cards (NICs), but they reduce the CPU overhead involved with network traffic. In some workloads, the extra CPU overhead of a normal NIC is not a problem, but in other heavy network workloads, the extra CPU overhead of the normal NIC reduces the amount of work that the system is able to perform, and the TOE provides an advantage (by freeing up extra CPU cycles to perform real work).

For 10Gb, you can do without an offload card if you have enough host CPU cycles at your disposal, or in the case of a target, if you are not aggregating too many initiators, or are not using SSDs and do not need high IOPS. At 40Gb and above, you will likely need offload assist in your system.

Q. Are queue depths the same for both FC and iSCSI? Or are there any differences?

A. The concept of queue depth is the same for both. At the SCSI layer, queue depth is the number of commands that a LUN may operate on concurrently. When that number of outstanding commands is reached, the LUN refuses to accept any additional commands (any additional commands are failed with the status of TASK SET FULL). As a SCSI layer concept, the queue depth is not impacted by the transport type (iSCSI or FC). There is no relationship between this value and concepts such as FC Buffer Credits or iSCSI R2T (Ready to Transfer). In addition, some adapters have a limit on the number of outstanding commands that may be present at the adapter layer.

As a result of interactions between the queue depth limits of an individual LUN and the queue depth limits of the adapters, hosts often allow for administrative management of the queue depth. This management enables a system administrator to balance the IO load across LUNs so that a single busy LUN does not consume all available adapter resources. In this case, the queue depth value set at the host is used by the host as a limiter of the number of concurrent outstanding commands (rather than waiting for the LUN to report the TASK SET FULL status). Again, management of these queue depth values is independent of the transport type. However, on some hosts, the management of queue depth may appear different (for example, the commands used to set a maximum queue depth for a LUN on an FC transport vs. a LUN on an iSCSI transport may be different).
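The interaction between the LUN's own limit and a host-side limit can be modeled with a small Python sketch. The class names (`Lun`, `HostQueue`), the status strings other than TASK SET FULL, and the numbers are assumptions for illustration; nothing here depends on whether the transport is FC or iSCSI.

```python
class Lun:
    """Toy LUN that rejects commands once its queue depth is reached."""

    def __init__(self, queue_depth):
        self.queue_depth = queue_depth
        self.outstanding = 0

    def submit(self):
        if self.outstanding >= self.queue_depth:
            return "TASK SET FULL"
        self.outstanding += 1
        return "ACCEPTED"

    def complete(self):
        """Finish one outstanding command, freeing a queue slot."""
        if self.outstanding > 0:
            self.outstanding -= 1


class HostQueue:
    """Host-side limiter: holds commands back before the LUN has to refuse them."""

    def __init__(self, lun, host_limit):
        self.lun = lun
        self.host_limit = host_limit
        self.held = 0  # commands waiting at the host

    def submit(self):
        if self.lun.outstanding >= self.host_limit:
            self.held += 1
            return "QUEUED AT HOST"
        return self.lun.submit()
```

With `Lun(2)`, a third concurrent command is failed with TASK SET FULL; with `HostQueue(Lun(4), host_limit=2)`, the third command is held at the host instead, so the LUN never reaches its own limit.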

Q. Is VMware happier with FC or iSCSI, assuming almost the same speed? What about the network delay in contrast with the FC protocol, which is (or was) faster?

A. Unfortunately, we can’t comment on individual companies’ best practice recommendations. However, you can refer to VMware’s Best Practices Guides for Fibre Channel and iSCSI:

Best Practices for Fibre Channel Storage

Best Practices For Running VMware vSphere on iSCSI  

Q. Does iSCSI have true load balancing when Ethernet links are aggregated? Meaning the links share even loads? Can it be done across switches? I’m trying to achieve load balancing and redundancy at the same time.

A. In most iSCSI software and hardware offload implementations, load balancing is supported using “multi-pathing” between a server and a storage array. Multi-pathing provides the ability to load-balance between paths when all paths are present and to handle the failure of a path at any point between the server and the storage. Multi-pathing is also a de facto standard for load balancing and high availability in most Fibre Channel SAN environments.

Q. Between FC and iSCSI, what are the typical workloads for one or the other?

A.  It’s important to remember that both Fibre Channel and iSCSI are block storage protocols. That is, for applications and workloads that require block storage, both Fibre Channel and iSCSI are relevant.

From a connectivity standpoint, there is not much difference between the protocols at a high level: you have an initiator in the host, a switch in the middle, and a storage target at the other end. What becomes important, then, is topologies and architectures. Fibre Channel has a tightly controlled oversubscription ratio, which is the number of hosts that we allow to access a single storage device (ratios typically fall between 4:1 and 20:1, depending on the application). iSCSI, on the other hand, has a looser relationship with oversubscription ratios, and can often have several dozen hosts to 1 storage target.
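As a small worked example of the ratio described above, the following Python sketch divides the number of hosts by the number of storage ports they share and compares the result with the 4:1 to 20:1 range quoted for Fibre Channel. The function names and the sample host counts are assumptions for illustration.

```python
def oversubscription_ratio(host_count, storage_port_count):
    """Hosts per storage port, e.g. 12.0 means 12:1."""
    if storage_port_count <= 0:
        raise ValueError("storage_port_count must be positive")
    return host_count / storage_port_count


def describe(ratio, low=4.0, high=20.0):
    """Compare a ratio with the typical FC range quoted in the text."""
    if ratio < low:
        return "below the typical FC range"
    if ratio > high:
        return "above the typical FC range"
    return "within the typical FC range"
```

For instance, 48 hosts sharing 4 storage ports gives 12:1, which sits inside the quoted range, while 60 hosts on 2 ports gives 30:1, closer to the “several dozen to 1” seen with iSCSI.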

Q. For IPSEC for iSCSI, are there hardware offload capabilities to do the encryption/decryption in iSCSI targets available, or is it all done in software?

A. Both hardware offload and software solutions are available. The tradeoff is typically cost. With a software solution, you pay the cost in extra overhead in the CPU. If your CPU is not already busy, then that cost is very low (you may not even notice). If, however, your CPU is busy, then the overhead of IPsec will slow down your application from getting real work done. With the hardware offload solution, the cost is the extra $$ to purchase the hardware itself. On the upside, the newest CPUs offer new instructions for reducing the overhead of the software processing of various security protocols.

Chelsio’s T6 offers integrated IPsec and TLS offload. This encryption capability can be used either for data-at-rest purposes (independent of the network link), or in conjunction with iSCSI (though this requires a special driver). The limitation of the special driver will be removed in the next generation.

Q. For any of the instruction participants: Are there any dedicated FC/iSCSI detailed installation guides (for dummies) you use or recommend from any vendor?

A.  No, there isn’t a single set of installation guides, as the best practices vary by storage and network vendor. Your storage or network vendor is the best place to start.

Q. If iSCSI is used in a shared mode, how is the performance?

A. Assuming this refers to sharing the link (pipe), iSCSI software and hardware implementations may be configured to utilize a portion of the aggregate link bandwidth without affecting performance.

Q. Any info on FCoE (Fibre Channel over Ethernet)?

A. There are additional talks on FCoE available from the SNIA site, both as on-demand webcasts and as blog posts.

In summary, FCoE is an encapsulation of the FC protocol into Ethernet packets that are carried over an Ethernet wire (without the use of TCP or IP).
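The “without TCP or IP” point shows up directly in the Ethernet header: FCoE frames carry their own EtherType (0x8906) rather than the IPv4 EtherType (0x0800) that iSCSI traffic uses. The Python sketch below packs and classifies a bare, untagged Ethernet II header; the MAC addresses are placeholders, the function names are made up for this example, and real FCoE traffic typically also carries a VLAN tag, which is left out for simplicity.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # iSCSI frames carry IP, then TCP, then iSCSI
ETHERTYPE_FCOE = 0x8906  # FCoE frames carry the encapsulated FC frame directly


def ethernet_header(dst_mac, src_mac, ethertype):
    """Pack a 14-byte untagged Ethernet II header."""
    return struct.pack("!6s6sH", dst_mac, src_mac, ethertype)


def payload_kind(frame):
    """Classify an untagged Ethernet frame by its EtherType field."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_FCOE:
        return "FCoE (encapsulated FC frame, no TCP/IP)"
    if ethertype == ETHERTYPE_IPV4:
        return "IPv4 (may carry TCP and iSCSI)"
    return "other"


# Placeholder locally administered MAC addresses, for illustration only.
DST = bytes.fromhex("020000000001")
SRC = bytes.fromhex("020000000002")
print(payload_kind(ethernet_header(DST, SRC, ETHERTYPE_FCOE)))
print(payload_kind(ethernet_header(DST, SRC, ETHERTYPE_IPV4)))
```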

Q. What is FC’s typical network latency in relation to storage access and compare to iSCSI?

A.  For hardware-offloaded iSCSI, the latency is essentially the same since both stacks are processed at silicon speeds.

Q. With 400Gbps Ethernet on the horizon, cloud providers and enterprises adopting Hyper-converged architectures based on Ethernet, isn’t it finally death of FC, at least in the mainstream, with exception of some niche verticals, which also still run mainframes?

A. No. Tape is still with us, and its demise has been predicted for a long time. There are still good reasons for investing in FC; for example, sunk costs, traditional environments and applications, and the other advantages explained in the presentation. That said, the ubiquity of the Ethernet ecosystem, which drives features, performance, and lower cost, has been and will continue to be a major challenge for FC.

And so, the FC vs. iSCSI debate continues. Ready for another “Great Debate?” So are we. Register now for our next live webcast, “File vs. Block vs. Object,” on April 17th. We hope to see you there!
