iSCSI is one of the most broadly supported storage protocols, but traditionally has not been associated with the highest performance. Newer protocols like iSER and NVMe over Fabrics promise extreme performance but are still maturing and lack the broad feature and platform support of iSCSI. Storage vendors and customers face interesting tradeoffs and options when evaluating how to achieve the highest block storage performance on Ethernet networks, while preserving the major software and hardware investment in iSCSI.
iSCSI
With support from all the major storage vendors, as well as a host of Tier 2 and Tier 3 storage players, Internet Small Computer System Interface (iSCSI) is the most mature and widely supported storage protocol available. At the server level, Microsoft provides an iSCSI initiator for Windows Server environments, making configuration relatively simple, and almost all other major server operating systems and hypervisors support iSCSI natively as well. Where performance is required, dedicated iSCSI initiator adapters can be used to reduce server or storage CPU load. Per IDC, iSCSI represents a $3.6B projected system revenue TAM in 2017[1].
Beyond storage and operating system support, the iSCSI protocol has benefited from its ease of adoption. The protocol uses standard 1/10/25/40/50/100 Gigabit Ethernet transport and is transmitted over TCP/IP, which often helps simplify the implementation and operational requirements of large environments.
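To make the TCP/IP point concrete, the short Python sketch below checks that an iSCSI portal is reachable over an ordinary TCP connection (3260 is the default iSCSI port) and then runs SendTargets discovery through the standard open-iscsi iscsiadm tool. This is a minimal illustration only; it assumes a Linux host with open-iscsi installed, and the portal address is a hypothetical placeholder.

```python
# Minimal sketch: verify that an iSCSI portal is reachable over plain TCP/IP
# and run SendTargets discovery with open-iscsi's iscsiadm.
# Assumptions: a Linux host with the open-iscsi tools installed; the portal
# address below is a hypothetical placeholder.
import socket
import subprocess

PORTAL_IP = "192.168.10.20"   # hypothetical storage array portal
PORTAL_PORT = 3260            # default (IANA-registered) iSCSI port


def portal_reachable(ip: str, port: int = PORTAL_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the iSCSI portal succeeds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def discover_targets(ip: str, port: int = PORTAL_PORT) -> str:
    """SendTargets discovery via the standard open-iscsi CLI."""
    result = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", f"{ip}:{port}"],
        capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    if portal_reachable(PORTAL_IP):
        print(discover_targets(PORTAL_IP))
    else:
        print(f"iSCSI portal {PORTAL_IP}:{PORTAL_PORT} is not reachable over TCP")
```

Because the transport is plain TCP/IP, the same check works across routed data center networks or a WAN, with no special switch features required.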
The performance advantages of iSCSI are compelling. Storage arrays with 10/25/40/50/100G iSCSI adapters (scalable to 200/400+ Gb) are now available, and with iSCSI offload adapters they can easily keep pace with new multicore server processors; many iSCSI hardware initiators can generate hundreds of thousands of application-level IOPS[2].
For enterprise customers, iSCSI offers three distinct advantages. First, iSCSI represents a SAN protocol with a built-in “second source” in the form of a software-only solution that runs on any Ethernet NIC. Replacing an iSCSI offload adapter with the software solution results in lower performance, but it allows customers to choose a different price/performance ratio that meets their needs. Second, iSCSI allows host and storage systems to run the protocol in either hardware or software, thus decoupling the upgrade cycles of the various pieces of server and storage hardware from each other. Third, the iSCSI software initiator is the most widely supported in-box SAN capability among all OS vendors.
Traditionally seen as a solution for small and medium enterprise organizations, iSCSI, with its support for 25/40/50/100 Gigabit Ethernet transport and the pervasive TCP/IP protocol, is a natural fit for storage networking in the most demanding enterprise environments, for private and public cloud communications, and across wide-area networks (WANs).
iSER
The iSCSI Extensions for RDMA (iSER) protocol is an iSCSI variant that takes advantage of RDMA fabric hardware to enhance performance. It is effectively a translation layer that converts iSCSI operations into RDMA transactions for operation over Ethernet RDMA transports such as iWARP and RDMA over Converged Ethernet (RoCE), as well as non-Ethernet transports including InfiniBand and OmniPath Architecture. iSER over iWARP or RoCE generally cannot run as a pure software stack (except by using soft iWARP or soft RoCE) and, to deliver its performance benefits, requires RDMA-enabled 10/25/40/50/100GbE offload hardware in both the server initiators and the target systems. iSER end-nodes can communicate only with other iSER end-nodes; interoperating with iSCSI end-nodes requires disabling the iSER extensions, which forfeits the hardware offload support and its performance benefits.
One of the caveats of iSER is that end-nodes can only interoperate with other end-nodes supporting the same underlying fabric variant; for example, iSER RoCE initiators can only interoperate with iSER RoCE targets, not with iSER iWARP end-nodes. iSER variants also inherit the characteristics of the underlying RDMA transport. Thus, iSER on RoCE requires Ethernet adapters capable of supporting RoCE, which in turn calls for lossless Ethernet via Priority Flow Control (PFC) and, optionally, ECN in multi-switch environments. iSER on iWARP, by contrast, uses TCP/IP and can run both within data centers and across metropolitan area networks (MANs) and WANs, wherever standard TCP/IP is supported.
On the performance side, however, hardware offload and direct data placement (DDP) give iSER improved efficiency and lower CPU utilization compared to iSCSI software implementations.
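Because iSER essentially re-maps iSCSI onto an RDMA transport rather than defining a new management model, Linux initiators commonly drive it with the same open-iscsi tooling used for plain iSCSI, switching a node record's transport from tcp to iser. The Python sketch below illustrates one commonly documented approach; it assumes RDMA-capable adapters on both the initiator and the target, and the IQN and portal shown are hypothetical placeholders.

```python
# Sketch only: switch an existing open-iscsi node record from software iSCSI
# (transport "tcp") to iSER, then log in. Assumes a Linux initiator with
# open-iscsi and an RDMA-capable NIC on both ends of the link; the IQN and
# portal below are hypothetical placeholders.
import subprocess

TARGET_IQN = "iqn.2003-01.org.example:storage.target1"   # hypothetical
PORTAL = "192.168.10.20:3260"                             # hypothetical


def run(args):
    subprocess.run(args, check=True)


# Change the recorded transport for this node from tcp to iser ...
run(["iscsiadm", "-m", "node", "-T", TARGET_IQN, "-p", PORTAL,
     "--op", "update", "-n", "iface.transport_name", "-v", "iser"])

# ... then log in; the session now runs over the RDMA transport (iWARP or
# RoCE) provided by the underlying adapters.
run(["iscsiadm", "-m", "node", "-T", TARGET_IQN, "-p", PORTAL, "--login"])
```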
Comparing iSCSI and iSER
Despite the similarities and the origin of its name, iSER is incompatible with iSCSI, and iSER storage systems must fall back to standard iSCSI to interoperate with the very large existing iSCSI installed base. Implementations of the two transports also differ: as noted above, iSER requires like-for-like RDMA offload adapters on both ends of a link, whereas iSCSI target-mode offload implementations are fully interoperable with software initiator peers. The net result is that iSCSI offers the option to mix hardware and software end-nodes, with hardware initiators delivering application-level performance comparable to iSER. For the current generation of solid-state drives (SSDs), hardware-offloaded iSCSI and iSER provide about the same level of CPU utilization and throughput [3].
NVMe over Fabrics (NVMe-oF)
Non-Volatile Memory Express (NVMe) is an optimized protocol for host communication with direct-attached, flash-based PCIe devices such as SSDs. NVMe over Fabrics (NVMe-oF) is a technology specification designed to enable NVMe message-based commands to transfer data between a host computer and a target solid-state storage device or system over a network such as Ethernet (RoCE and iWARP), Fibre Channel, InfiniBand, or OmniPath.
As with iSER, NVMe-oF Ethernet RDMA end-nodes can only interoperate with other NVMe-oF Ethernet end-nodes supporting the same Ethernet RDMA transport, such as iWARP-to-iWARP or RoCE-to-RoCE. In addition, NVMe-oF end-nodes cannot interoperate with iSCSI or iSER end-nodes. Regarding network requirements, NVMe-oF on RoCE requires switches capable of DCB functions (e.g., ETS/PFC or ECN), while NVMe-oF on iWARP can run over any switches supporting TCP/IP, both within data centers and across MANs and WANs, or even over wireless links.
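To make the host-side workflow concrete, the sketch below shows how a Linux initiator with the standard nvme-cli utility and an RDMA-capable (iWARP or RoCE) NIC might discover and connect to an NVMe-oF subsystem. It is a sketch under stated assumptions, not a definitive recipe: 4420 is the IANA-assigned NVMe-oF port, and the address and NQN shown are hypothetical placeholders.

```python
# Sketch only: discover and connect to an NVMe-oF subsystem over an RDMA
# transport using the standard nvme-cli utility. Assumes a Linux host with
# nvme-cli installed, the nvme-rdma kernel module loaded, and an RDMA-capable
# (iWARP or RoCE) NIC; the address and NQN below are hypothetical placeholders.
import subprocess

TARGET_ADDR = "192.168.10.30"                     # hypothetical target address
TARGET_PORT = "4420"                              # IANA-assigned NVMe-oF port
SUBSYS_NQN = "nqn.2016-06.io.example:subsystem1"  # hypothetical subsystem NQN


def run(args):
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


# Query the discovery controller for the subsystems it exposes ...
print(run(["nvme", "discover", "-t", "rdma", "-a", TARGET_ADDR, "-s", TARGET_PORT]))

# ... then connect to one of them; once connected, its namespaces appear as
# local NVMe block devices (e.g., /dev/nvme1n1) on the host.
run(["nvme", "connect", "-t", "rdma", "-n", SUBSYS_NQN,
     "-a", TARGET_ADDR, "-s", TARGET_PORT])
```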
While iSCSI is an established market with a broad-based ecosystem enabling high-volume shipments, as of this writing NVMe over Fabrics is still largely in the proof-of-concept phase, with many necessary components of its ecosystem, such as standards and drivers, only now nearing finalization.
Nevertheless, there is a large community working to develop NVMe over Fabrics. As the non-volatile memory-based storage solution space matures, NVMe-oF will become more attractive.
Putting it all together
iSCSI is a well-established and mature storage networking solution, supported by many arrays, initiators, and nearly all operating systems and hypervisors. Performance can be enhanced with hardware-accelerated iSCSI adapters on the initiator or target. The iSCSI ecosystem continues to evolve by adding support for higher speeds up to 100GbE and with growing support for iSER as a way to deliver iSCSI over RDMA transports.
At the same time, NVMe-oF presents enterprise end-users with a major challenge: how to preserve the major software and hardware investment in iSCSI while considering other storage protocols. Similarly, enterprise storage system vendors require continued development of iSCSI storage product lines (including offering higher-performance 100GbE networking) while evolving to support other storage protocols and infrastructures. Immediate performance challenges can be addressed using hardware-offloaded 100G iSCSI. In the longer term, storage vendors are likely to address these challenges through concurrent support of iSCSI, iSER, and NVMe-oF for evolutionary, non-disruptive deployment of next-generation flash storage systems.
[1] IDC WW Quarterly Disk Storage Systems Forecast, June 2015.
[2] See, for example, p. 7 (SQL Server 2016 OLTP IOPS) of “Evaluation of the Chelsio T580-CR iSCSI Offload Adapter,” Demartek, 2016.
[3] See, for example, “iSCSI or iSER,” 2015 SNIA Storage Developer Conference, pages 28-29.