Hyperconverged infrastructures (also known as “HCI”) are designed to be easy to set up and manage. All you need to do is add networking. In practice, the “add networking” part has been more difficult than most anticipated. That’s why the SNIA Networking Storage Forum (NSF) hosted a live webcast “The Networking Requirements for Hyperconverged Infrastructure” where we covered what HCI is, storage characteristics of HCI, and important networking considerations. If you missed it, it’s available on-demand.
We had some interesting questions during the live webcast and, as promised, here are answers from our expert presenters:
Q. An HCI configuration ought to consist of 3 or more nodes, or have I misunderstood this? In an earlier slide I saw HCI with 1 and 2 nodes.
A. You are correct that HCI typically requires 3 or more nodes with resources pooled together to ensure data is distributed through the cluster in a durable fashion. Some vendors have released 2 node versions appropriate for edge locations or SMBs, but these revert to a more traditional failover approach between the two nodes rather than a true HCI configuration.
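One way to see why three nodes is the usual floor is majority quorum. This is an assumption for illustration, since the answer above does not name the mechanism any particular vendor uses. The minimal sketch below (function names are hypothetical) computes how many node failures a cluster can absorb while a strict majority of nodes stays available.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can fail while a strict majority remains available."""
    return nodes - majority(nodes)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption a 2-node cluster tolerates no failures by majority alone, which is consistent with 2-node products falling back to a traditional failover arrangement.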
Q. Does NVMe-oF mean running NVMe over Fibre Channel, or something else?
A. The “F” in “NVMe-oF” stands for “Fabrics”. As of this writing, there are 3 “official” Fabric transports explicitly outlined in the specification: RDMA-based (InfiniBand, RoCE, iWARP), TCP, and Fibre Channel. HCI, however, is a topology that is almost exclusively Ethernet-based, so Fibre Channel is a less likely storage networking transport for the solution.
The spec for NVMe-oF using TCP was recently ratified, and may gain traction quickly given the broad deployment of TCP and comfort level with the technology in IT. You can learn more about NVMe-oF in the webinar “Under the Hood with NVMe over Fabrics” and NVMe/TCP in this NSF webcast “What NVMe™/TCP Means to Networked Storage.”
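The transport options in the answer above can be summarized as a small lookup table. This is only a sketch: the `Transport` class and the `runs_on_ethernet` flag are hypothetical names, and the flag reflects typical deployments rather than a rule from the specification (for example, it ignores Fibre Channel over Ethernet).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    name: str
    family: str             # grouping used in the answer: RDMA, TCP, or Fibre Channel
    runs_on_ethernet: bool  # typical deployment, a simplification


TRANSPORTS = [
    Transport("InfiniBand", "RDMA", False),
    Transport("RoCE", "RDMA", True),
    Transport("iWARP", "RDMA", True),
    Transport("NVMe/TCP", "TCP", True),
    Transport("Fibre Channel", "Fibre Channel", False),
]


def ethernet_candidates(transports):
    """Names of the transports that typically run on an Ethernet network."""
    return [t.name for t in transports if t.runs_on_ethernet]


print(ethernet_candidates(TRANSPORTS))
```

Filtering on Ethernet leaves RoCE, iWARP, and NVMe/TCP, which is why those are the options most often discussed for HCI clusters.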
Q. In the past we have seen vendors leverage RDMA within the host but not take it to the fabric, i.e., RDMA yes, RDMA over fabric maybe not. Within HCI, do you see fabrics being required to be RDMA-aware and, if so, who do you think will ultimately decide this: the HCI vendor, the application vendor, the customer, or someone else?
A. The premise of HCI systems is that there is an entire ecosystem “under one roof,” so to speak. Vendors with HCI solutions on the market can choose the networking protocols that work best with their levels of virtualization and abstraction.
To that end, it may be possible that RDMA-capable fabrics will become more common as workload demands on the network increase, and IT looks for various ways to optimize traffic. Hyperconverged infrastructure, with lots of east-west traffic between nodes, can take advantage of RDMA and NVMe-oF to improve performance and alleviate certain bottlenecks in the solution. It is, however, only one component piece of the overall picture. The HCI solution needs to know how to take advantage of these fabrics, as do switches, etc. for an end-to-end solution, and in some cases other transport forms may be more appropriate.
Q. What is a metadata network? I had never heard that term before.
A. Metadata is the data about the data. That is, HCI systems need to know where the data is located, when it was written, and how to access it. That information about the data is called metadata.
As systems grow over time, the amount of metadata that exists in the system grows as well. In fact, it is not uncommon for the metadata quantity and traffic to exceed the data traffic. For that reason, some vendors recommend establishing a completely separate network for handling the metadata traffic that traverses the system.
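As a rough illustration of the idea, the sketch below models a metadata record with the three pieces of information mentioned above and routes it to a dedicated network. Every name here (`BlockMetadata`, `network_for`, the network labels, the sample path) is hypothetical; real HCI products define their own record formats and network configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class BlockMetadata:
    block_id: str
    node: str             # where the data is located
    written_at: datetime  # when it was written
    access_path: str      # how to access it


# Hypothetical network labels, used only for this example.
DATA_NETWORK = "data-net"
METADATA_NETWORK = "metadata-net"


def network_for(payload) -> str:
    """Route metadata records to the dedicated network, everything else to the data network."""
    return METADATA_NETWORK if isinstance(payload, BlockMetadata) else DATA_NETWORK


record = BlockMetadata(
    block_id="blk-0001",
    node="node-2",
    written_at=datetime(2019, 1, 1, tzinfo=timezone.utc),
    access_path="/pool0/blk-0001",
)

print(network_for(record))        # metadata-net
print(network_for(b"raw bytes"))  # data-net
```

Keeping the two traffic classes on separate networks means that growth in metadata chatter does not compete with the data path for bandwidth.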