Expanding Your Data Center with FCoE – Q&A

At our recent live ESF Webcast, “Expert Insights: Expanding the Data Center with FCoE,” we examined the current state of FCoE and looked at how this protocol can expand the agility of the data center if you missed it, it’s now available  on-demand. We did not have time to address all the questions, so here are answers to them all. If you think of additional questions, please feel free to comment on this blog.

Q. You mentioned using 40 and 100G for inter-switch links.   Are there use cases for end point (FCoE target and initiator) 40 and 100G connectivity?

A. Today most end points are only supporting 10G, but we are starting to see 40G server offerings enter the market, and activity among the storage vendors designing these 40G products into their arrays.

Q. What about interoperability between FCoE switch vendors?

A. Each switch vendor has his own support matrix, and would need to be examined independently.

Q. Is FCoE supported on copper cable?

A. Yes, FCoE supports “Twin Ax” copper and is widely used for server to top of rack switch connections to seven meters.  In fact, Converged Network Adapters are now available that support 10GBASE-T copper cables with the familiar RJ-45 jack.   At least one major switch vendor has qualified FCoE running over 10GBASE-T to 30 meters.

Q. What distance does FCoE support?

A. Distance limits are dependent on the hardware in use and the buffering available for Priority Flow Control. The lengths can vary from 3m up to over 80km. Top of rack switches would fall into the 3m range while larger class switch/directors would support longer lengths.

Q. Can FCoE take part in management/orchestration by OpenStack Neutron?

A. As of this writing there are no OpenStack extensions in Neutron for FCoE-specific plugins.

Q. So how is this FC-BB-6 different than FIP snooping?

A. FIP Snooping is a part of FC-BB-5 (Appendix D), which allows switch devices to identify an FCoE Frame format and create a forwarding ACL to a known FCF. FC-BB-6 creates additional architectural elements for deployments, including a “switch-less” environment (VN2VN), and a distributed switch architecture with a controlling FCF. Each of these cases is independent from the other, and you would choose one instead of the others. You can learn more about VN2VN from our SNIA-ESF Webcast, “How VN2VN Will Help Accelerate Adoption of FCoE.”

Q. You mentioned DCB at the beginning of the presentation. Are there other purposes for DCB? Seems like a lot of change in the network to create a DCB environment for just FCoE. What are some of the other technologies that can take advantage of DCB?

A. First, DCB is becoming very ubiquitous. Unlike the early days of the standard, where only a few switches supported it, today most enterprise switches support DCB protocols. As far as other use cases for DCB, iSCSI benefits from DCB, since it eliminates dropped packets and the TCP/IP protocol’s backoff algorithm when packets are dropped, smoothing out response time for iSCSI traffic. There is a protocol known as RoCE or RDMA over Converged Ethernet. RoCE requires the lossless fabric DCB creates to achieve consistent low latency and high bandwidth.   This is basically the InfiniBand API running over Ethernet. Microsoft’s latest version of file serving protocol, SMB Direct, and the Hyper-V Live Migration can utilize RoCE, and there is an extension to iSCSI known as iSER, which replaces TCP/IP with RDMA for the iSCSI datamover; enabling all iSCSI reads and writes to be done as RDMA operations using RoCE.

Q. Great point about RoCE.   iSCSI RDMA (iSER) is required from DCB if the adapters support RoCE, right?

A. Agreed. Please see the answer above to the DCB question.

Q. Did that Boeing Aerospace diagram still have traditional FC links, and if yes, where?

A. There was no Fibre Channel storage attached in that environment. Having the green line in the ledger was simply to show that Fibre Channel would have it’s own color should there be any links.

Q. What is the price of a 10 Gbp CNA compare to a 10Gbps NIC ?

A. Price is dependent on vendor and economics. But, there are several approaches to delivering the value of FCoE which can influence pricing:

  • Purpose built silicon that offloads the FC and Ethernet protocol functions offer a number of advantages including high performance, low CPU overhead, advanced features, etc., though even this depends on the vendor’s implementation.    But, these added features come with the expectation of additional cost. But, the processing of the protocols has to be done somewhere, and if you need your server CPUs to process applications instead of network protocols, then the value is justified.
  • With the introduction of Open FCoE drivers with DCB supported NICs, new options are available for customers to deploy the value of FCoE at the host. Open FCoE offloads the FC processing onto the host CPU and standard 10GbE NICs with DCB support can be used to manage the Ethernet transport functions. Where you have excess CPU capacity on your server, you might be in a position to reduce costs and deploy a software driver with  a 10GbE or faster NIC enhanced with the limited set of hardware offloads necessary to achieve full performance with Open FCoE. However, Open FCoE isn’t available with every OS or every NIC, so you need to consider OS support and availability.
  • A third consideration is that most enterprise servers include some form of advanced 10GbE networking on the motherboard that either supports purpose built silicon or DCB enabled silicon. So, depending upon which server and OS you deploy, you may have several options via embedded silicon.

 

How DCB Makes iSCSI Better

A challenge with traditional iSCSI deployments is the non-deterministic nature of Ethernet networks. When Ethernet networks only carried non-storage traffic, lost data packets where not a big issue as they would get retransmitted. However; as we layered storage traffic over Ethernet, lost data packets became a “no no” as storage traffic is not as forgiving as non-storage traffic and data retransmissions introduced I/O delays which are unacceptable to storage traffic. In addition, traditional Ethernet also had no mechanism to assign priorities to classes of I/O.

Therefore a new solution was needed. Short of creating a separate Ethernet network to handle iSCSI storage traffic, Data Center Bridging (DCB), was that solution.

The DCB standard is a key enabler of effectively deploying iSCSI over Ethernet infrastructure. The standard provides the framework for high-performance iSCSI deployments with key capabilities that include:
– Priority Flow Control (PFC)—enables “lossless Ethernet”, a consistent stream of data between servers and storage arrays. It basically prevents dropped frames and maximizes network efficiency. PFC also helps to optimize SCSI communication and minimizes the effects of TCP to make the iSCSI flow more reliably.
– Quality of Service (QoS) and Enhanced Transmission Selection (ETS)—support protocol priorities and allocation of bandwidth for iSCSI and IP traffic.
– Data Center Bridging Capabilities eXchange (DCBX) — enables automatic network-based configuration of key network and iSCSI parameters.

With DCB, iSCSI traffic is more balanced over high-bandwidth 10GbE links. From an investment protection perspective, the ability to support iSCSI and LAN IP traffic over a common network makes it possible to consolidate iSCSI storage area networks with traditional IP LAN traffic networks. There is also another key component needed for iSCSI over DCB. This component is part of Data Center Bridging eXchange (DCBx) standard, and it’s called TCP Application Type-Length-Value, or simply “TLV”! TLV allows the DCB infrastructure to apply unique ETS and PFC settings to specific sub-segments of the TCP/IP traffic. This is done through switches which can identify the sub-segments based on their TCP socket or port identifier which are included in the TCP/IP frame. In short, TLV directs servers to place iSCSI traffic on available PFC queues, which separates storage traffic from other IP traffic. PFC also eliminates data retransmission and supports a consistent data flow with low latency. IT administrators can leverage QoS and ETS to assign bandwidth and priority for iSCSI storage traffic, which is crucial to support critical applications.

Therefore, depending on your overall datacenter environment, running iSCSI over DCB can improve:
– Performance by insuring a consistent stream of data, resulting in “deterministic performance” and the elimination of packet loss that can cause high latency
– Quality of service through allocation of bandwidth per protocol for better control of service levels within a converged network
– Network convergence

For more information on this topic or technologies discussed in this blog, please visit some of our other blog articles:
What Up with DCBX Blog  and iSCSI over DCB: Reliability and predictable performance  or check out the IEEE website on DCB

What Up with DCBX?

I guess this is a blog that could either be very short or very long… The full name of the protocol – Data Center Bridging capability eXchange (DCBX) basically tells you all you need to know or maybe nothing at all. At its simplest, DCBX does what it says on the tin and the way it is in effect used is no more or less than the DCB auto negotiation capability to make sure that the data center network is correctly and consistently configured. It is important to note that technically you can debate if this is an auto negotiation protocol or not, but in reality that’s how it is actually used.

Now it is important to note that there are many misnomers around DCB itself. Let’s remember that DCB is actually a group within IEEE responsible for many separate standards – basically anything for Ethernet (or as IEEE say bridging) that is assumed to be specific to the data center. Currently, discussed are those standards and protocols related to I/O Convergence (PFC, ETS, QCN, DCBX) and those related to server virtualization (Virtual Ethernet Port Aggregator or VEPA and others). So in essence the intent of DCBX is to help two adjacent devices share information about how these protocols are, or need to be, configured. DCBX actually does this by leveraging good old LLDP – just as PFC, ETS and QCN leverage 802.1p. What is particularly nice though is that DCBX not only allows the simple exchange of information around the DCB protocols themselves but also around how upper level protocols might want to use the DCB layer.

This brings us nicely to a very critical point – like most things in this area, DCBX purely works at the link level to allow a pair of connected ports (node to switch or switch to switch) to exchange their specific port configuration. This is an important point as in a multi-hop environment you need to keep in mind that every link may successfully complete its DCBX negotiation but unless some higher level intelligence (you) ensures that things are set right on each and every link, you may still not be meeting the needs of an end-to-end traffic flow. Even in a simple case of device-switch-switch-device I could have Fibre Channel over Ethernet (FCoE) negotiated on the first device-switch and last switch-device connection, and nothing configured on the intermediate switch-switch connection – and the two FCoE end points would happily talk to each other thinking that they have end-to-end lossless connectivity. In a more complex scenario let’s also remember that many L2/L3 switches have not just the ability to route between L2 domains, but also have the ability to reclassify traffic from one 802/1p priority to another. For this reason it is often simpler to use DCB to support 8 independent forwarding planes across the data center as this means we can simply configure all ports pretty much identically. I believe the term here around being clever is ‘here be dragons’.

Anyone that has spent a little time with DCB or FCoE will actually know that DCBX doesn’t just help at the level of the layer 2 protocols, but also helps at the level of the actual upper level protocol we care about. Most well known is that DCBX can carry specific exchanges to ensure the correct configuration of DCB to support FCoE and many people may be aware that it can do the same for iSCSI as well. Far less known however is that these two examples of setting up DCB for upper level protocols are in fact just that – examples. DCBX actually has a generic application type-length-value (TLV) format whereby you can specify what you would like for any upper level protocol that can be identified by either Ethertype or IP socket. Thus DCBX has like the rest of DCB been carefully architected to support the full broad needs of I/O and network convergence and not just the needs of storage convergence. DCBX as a protocol allows you to have an NFS Application TLV, an SMB Application TLV, a RDMA over Converged Ethernet or RoCE Application TLV, iWarp Application TLV, an SNMP Application TLV – etc.

A final and very practical point that any article on DCBX needs to cover is that we are in an evolving world and there are multiple different, and indeed incompatible, versions of DCBX available. Just reviewing the common DCB equipment available today you need to consider DCBX 1.0 as used by pre-standards FCoE products, DCBX 1.01 sometimes referred to as the Converged Enhanced Ethernet (CEE) or baseline version as found most commonly on shipping products today, and DCBX IEEE as actually defined in the standards (physically mostly contained within the ETS standard). It is also important to note that while some products have mechanisms to auto discover and select which version of DCBX to use, there is in fact no standard for such mechanisms. In this case the term is I assume ‘caviat emptor – buyer beware’.

All that said, maybe I should have started this blog reminding everyone that the I/O convergence parts of DCB are not just about allowing storage traffic to be mixed with non-storage traffic without fate sharing problems, but is actually about collapsing the multiple networks of different networks into a single network. I believe the average server is said to have about 6 NICs’ today? As such in the 10GbE and up Ethernet world, the full capabilities of DCBX really are a critical enabler for simplifying the operation of the modern converged virtualized data center.

Ethernet Storage Forum – 2012 Year in Review and What to Expect in 2013

As we come to a close of the year 2012, I want to share some of our successes and briefly highlight some new changes for 2013. Calendar year 2012 has been eventful and the SNIA-ESF has been busy. Here are some of our accomplishments:

  • 10GbE – With virtualization and network convergence, as well as the general availability of LOM and 10GBASE-T cabling, we saw this is a “breakout year” for 10GbE. In July, we published a comprehensive white paper titled “10GbE Comes of Age.” We then followed up with a Webcast “10GbE – Key Trends, Predictions and Drivers.” We ran this live once in the U.S. and once in the U.K. and combined, the Webcast has been viewed by over 400 people!
  • NFS – has also been a hot topic. In June we published a white paper “An Overview of NFSv4″ highlighting the many improved features NFSv4 has over NFSv3. A Webcast to help users upgrade, “NFSv4 – Plan for a Smooth Migration,” has also been well received with over 150 viewers to date.   A 4-part Webcast series on NFS is now planned. We kicked the series off last month with “Reasons to Start Working with NFSv4 Now” and will continue on this topic during the early part of 2013. Our next NFS Webcast will be “Advances in NFS – NFSv4.1 and pNFS.” You can register for that here.
  • Flash – The availability of solid state devices based on NAND flash is changing the performance efficiencies of storage. Our September Webcast “Flash – Plan for the Disruption” discusses how Flash is driving the need for 10GbE and has already been viewed by more than 150 people.

We have also added to expand membership and welcome new membership from Tonian and LSI to the ESF. We expect with this new charter to see an increase in membership participation as we drive incremental value and establish ourselves as a leadership voice for Ethernet Storage.

As we move into 2013, we expect two hot trends to continue – the broader use of file protocols in datacenter applications, and the continued push toward datacenter consolidation with the use of Ethernet as a storage network. In order to better address these two trends, we have modified our charter for 2013. Our NFS SIG will be renamed the File Protocol SIG and will focus on promoting not only NFS, but also SMB / CIFS solutions and protocols. The iSCSI SIG will be renamed to the Storage over Ethernet SIG and will focus on promoting data center convergence topics with Ethernet networks, including the use of block and file protocols, such as NFS, SMB, FCoE, and iSCSI, over the same wire. This modified charter will allow us to have a richer conversation around storage trends relevant to your IT environment.

So, here is to a successful 2012, and excitement for the coming year.

10GbE Answers to Your Questions

Our recent Webcast: 10GbE – Key Trends, Drivers and Predictions was very well received and well attended. We thank everyone who was able to make the live event. For those of you who couldn’t make it, it’s now available on demand. Check it out here.

There wasn’t enough time to respond to all of the questions during the Webcast, so we have consolidated answers to all of them in this blog post from the presentation team.   Feel free to comment and provide your input.

Question: When implementing VDI (1000 to 5000 users) what are best practices for architecting the enterprise storage tier and avoid peak IOPS / Boot storm problems?   How can SSD cache be used to minimize that issue?  

Answer: In the case of boot storms for VDI, one of the challenges is dealing with the individual images that must be loaded and accessed by remote clients at the same time. SSDs can help when deployed either at the host or at the storage layer. And when deduplication is enabled in these instances, then a single image can be loaded in either local or storage SSD cache and therefore it can be served the client much more rapidly. Additional best practices can include using cloning technologies to reduce the space taken up by each virtual desktop.

Question: What are the considerations for 10GbE with LACP etherchannel?

Answer: Link Aggregation Control Protocol (IEEE 802.1AX-2008) is speed agnostic.   No special consideration in required going to 10GbE.

Question: From a percentage point of view what is the current adoption rate of 10G Ethernet in data Centers vs. adoption of 10G FCoE?

Answer: As I mentioned on the webcast, we are at the early stages of adoption for FCoE.   But you can read about multiple successful deployments in case studies on the web sites of Cisco, Intel, and NetApp, to name a few.   The truth is no one knows how much FCoE is actually deployed today.   For example, Intel sells FCoE as a “free” feature of our 10GbE CNAs.   We really have no way of tracking who uses that feature.   FC SAN administrators are an extraordinarily conservative lot, and I think we all expect this to be a long transition.   But the economics of FCoE are compelling and will get even more compelling with 10GBASE-T.   And, as several analysts have noted, as 40GbE becomes more broadly deployed, the performance benefits of FCoE also become quite compelling.

Question: What is the difference between DCBx Baseline 1.01 and IEEE DCBx 802.1 Qaz?

Answer: There are 3 versions of DCBX
– Pre-CEE (also called CIN)
– CEE
– 802.1Qaz

There are differences in TLVs and the ways that they are encoded in all 3 versions.  Pre-CEE and CEE are quite similar in terms of the state machines.  With Qaz, the state machines are quite different — the
notion of symmetric/asymmetric/informational parameters was introduced because of which the way parameters are passed changes.

Question: I’m surprise you would suggest that only 1GBe is OK for VDI??   Do you mean just small campus implementations?   What about multi-location WAN for large enterprise if 1000 to 5000 desktop VMs?

Answer: The reference to 1GbE in the context of VDI was to point out that enterprise applications will also rely on 1GbE in order to reach the desktop. 1GbE has sufficient bandwidth to address VoIP, VDI, etc… as each desktop connects to the central datacenter with 1GbE. We don’t see a use case for 10GbE on any desktop or laptop for the foreseeable future.

Question: When making a strategic bet as a CIO/CTO in future (5-8 years plus) of my datacenter, storage network etc, is there any technical or biz case to keep FC and SAN?   Versus, making move to 10/40GbE path with SSD and FC?   This seems especially with move to Object Based storage and other things you talked about with Big Data and VM?   Seems I need to keep FC/SAN only if vendor with structured data apps requires block storage?

Answer: An answer to this question really requires an understanding of the applications you run, the performance and QOS objectives, and what your future applications look like. 10GbE offers the bandwidth and feature set to address the majority of application requirements and is flexible enough to support both file and block protocols. If you have existing investment in FC and aren’t ready to eliminate it, you have options to transition to a 10GbE infrastructure with the use of FCoE. FCoE at its core is FCP, so you can connect your existing FC SAN into your new 10GbE infrastructure with CNAs and switches that support both FC and FCoE. This is one of the benefits of FCoE – it offers a great migration path from FC to Ethernet transports. And you don’t have to do it all at once. You can migrate your servers and edge switches and then migrate the rest of your infrastructure later.

Question: Can I effectively emulate or out-perform SAN on FC, by building VLAN network storage architecture based on 10/40GBe, NAS, and use SSD Cache strategically.

Answer: What we’ve seen, and you can see this yourself in the Yahoo case study posted on the Intel website, is that you can get to line rate with FCoE.   So 10GbE outperforms 8Gbps FC by about 15% in bandwidth.   FC is going to 16 Gbps, but Ethernet is going to 40Gbps.   So you should be able to increasingly outperform FC with FCoE — with or without SSDs.

Question: If I have large legacy investment in FC and SAN, how do cost-effectively migrate to 10 or 40 GBe using NAS?   Does it only have to be greenfield opportunity? Is there better way to build a business case for 10GBe/NAS and what mix should the target architecture look like for large virtualized SAN vs. NAS storage network on IP?

Answer: The combination of 10Gb converged network adapter (CNA) and a top of the rack (TOR) switch that supports both FCoE and native FC allows you to preserve connectivity to your existing FC SAN assets while taking advantage of putting in place a 10Gb access layer that can be used for storage and IP.   By using CNAs and DCB Ethernet switches for your storage and IP access you are also helping to reduce your CAPEX and OPEX (less equipment to buy and manage using a common infrastructure.   You get the added performance (throughput) benefit of 10G FCoE or iSCSI versus 4G or 8G Fibre Channel or 1GbE iSCSI.   For your 40GbE core switches so you have t greater scalability to address future growth in your data center.

Question: If I want to build an Active-Active multi-Petabyte storage network over WAN with two datacenters 1000 miles apart to primarily support Big Data analytics,   why would I want to (..or not) do this over 10/40GBe / NAS vs.   FC on SAN?   Does SAN vs. NAS really enter into the issue?   If I got mostly file-based demand vs. block is there a tech or biz case to keep SAN ?

Answer: You’re right, SAN or NAS doesn’t really enter into the issue for the WAN part; bandwidth does for the amount of Big Data that will need to be moved, and will be the key component in building active/active datacenters. (Note that at that distance, latency will be significant and unavoidable; applications will experience significant delay if they’re at site A and their data is at site B.)

Inside the data center, the choice is driven by application protocols. If you’re primarily delivering file-based space, then a FC SAN is probably a luxury and the small amount of block-based demand can be delivered over iSCSI with equal performance. With a 40GbE backbone and 10GbE to application servers, there’s no downside to dropping your FC SAN.

Question: Are you familiar with VMware and CISCO plans to introduce a Beat for virtualized GPU Appliance (aka think Nivdia hardware GPUs) for heavy duty 3D visualization apps on VDI?  No longer needing expensive 3D workstations like RISC based SGI desktops. If so, when dealing with these heavy duty apps what are your concerns for network and storage network?

Answer: I’m afraid I’m not familiar with these plans.   But clearly moving graphics processing from the client to the server will add increasing load to the network.   It’s hard to be specific without a defined system architecture and workload.   However, I think the generic remarks Jason made about VDI and how NVM storage can help with peak loads like boot storms apply here as well, though you probably can’t use the trick of assuming multiple users will have a common image they’re trying to access.

Question: How do I get a copy of your slides from today?   PDF?

Answer: A PDF of the Webcast slides is available at the SNIA-ESF Website at: http://www.snia.org/sites/default/files/SNIA_ESF_10GbE_Webcast_Final_Slides.pdf