SMB3 – These Questions Rock!

Earlier this month, the SNIA Ethernet Storage Forum hosted a live webcast on Server Message Block (SMB), “Rockin’ and Rollin’ with SMB3.” Presenting was Ned Pyle, Microsoft SMB Program Manager. If you missed the live event, I encourage you to watch it on-demand. We had a lot of questions from the big audience this event drew, so as promised, here are answers to them all.

Q. Other than that audit setup, is there a way to determine, via the OS, which SMB version is in use?

A. No. Only network captures will tell you. Windows doesn’t track this explicitly, other than for SMB1, where we added auditing specifically to help identify whether it can be removed.

Q. SMB 3.1.1 over Ethernet… can you discuss/compare with SMB 3.1.1 over InfiniBand?

A. If the question is ‘what’s better, InfiniBand or Ethernet’, my answer is always: it depends. I really don’t want to get into a competitive conversation under the auspices of SNIA. I simply recommend looking at the vendor stories and making an informed decision. Overall, Ethernet-based RDMA configurations such as RoCE and iWARP are generally less expensive than InfiniBand ones. They all have tremendous performance, and they all have their various ups and downs.

Q. Do you have statistics regarding SMB-Direct adoption?

A. It’s tricky, as our telemetry for Server usage is quite inaccurate due to firewall rules preventing servers from reaching the Internet. I can say indirectly that we know of thousands of customer deployments.

Q. What’s the name of the IO application?

A. DiskSPD

Q. I don’t believe your I/O test data. Wouldn’t you need to trunk 17 10-Gigabit network cards to achieve 168 gigabit I/O capability?

A. This was a misunderstanding: you thought I said 10Gb, but it was 100Gb. We used 100Gb RDMA NICs in this demo with RoCEv2. The bottleneck at that point was the storage; the network had plenty of bandwidth left over.
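The arithmetic behind both the question and the answer is easy to check. The sketch below assumes nominal line rates and ignores protocol overhead; the function name is illustrative only, and the result is a lower bound rather than a description of the demo hardware.

```python
import math

def links_needed(target_gbps: float, link_gbps: float) -> int:
    """Minimum number of links whose combined nominal rate covers the target."""
    return math.ceil(target_gbps / link_gbps)

print(links_needed(168, 10))   # 17 links at 10 Gb/s, as the question assumed
print(links_needed(168, 100))  # 2 links at 100 Gb/s
```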

Q. These are great, but how many of these new features will end up locking out FOSS/GPL implementations of SMB such as SAMBA?

A. Absolutely not! We work with the Samba team and Linux to ensure that SMB can be broadly deployed with all of its capabilities inside open source software.

Q. NetApp supports CA shares (which uses transparent failover) in two use cases: SQL over SMB and Hyper-V over SMB3.

A. This sounds like someone from NetApp stating a fact, so I will simply say “good!” 🙂

Q.  Can you please post links to the tools mentioned in this presentation, and I/O tests? Is there a comparison using I/O Meter?

A. Here you go:

  • https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223
  • https://github.com/Microsoft/diskspd
  • https://github.com/Microsoft/diskspd/tree/master/Frameworks/VMFleet

Q. You are forced to use SMB1 because of the Windows 2003 issue?

A. Windows Server 2003 and XP (and older, like Win2000) all use SMB1. If they are still around, you will need to leave SMB1 enabled on any machines talking to them.

Q. When will Microsoft officially drop support for SMB1?

A. Overall for the protocol, there is no timeline. It is deprecated, however, so no further work will be done in SMB1 other than critical security patches. SMB1 will start being removed *by default* in a coming release of Windows Server and Windows 10 client. This doesn’t mean totally removed forever, but instead “missing by default”, where you must directly opt in to adding it back. It will be done on a per-SKU basis, so enterprises are likely to see it first, since they are better equipped to understand it and less likely to need SMB1.

Q. Is there a way to change block size in SMB3?

A. In SMB2_READ processing section 3.3.5.12 (https://msdn.microsoft.com/en-us/library/cc246729.aspx):

The server SHOULD<296> fail the request with STATUS_INVALID_PARAMETER if the Length field is greater than Connection.MaxReadSize.

If Connection.SupportsMultiCredit is TRUE the server MUST validate CreditCharge based on Length, as specified in section 3.3.5.2.5. If the validation fails, it MUST fail the read request with STATUS_INVALID_PARAMETER.

There is similar text for SMB2_WRITE in 3.3.5.13 (https://msdn.microsoft.com/en-us/library/cc246730.aspx).

Then, off to SMB2_NEGOTIATE in 3.3.5.4 (https://msdn.microsoft.com/en-us/library/cc246768.aspx) to discover:

  • MaxReadSize is set to the maximum size, in bytes, of the Length in an SMB2 READ Request (section 2.2.19) that the server will accept on the transport that established this connection. This value SHOULD<231> be greater than or equal to 65536. Connection.MaxReadSize MUST be set to MaxReadSize.
  • MaxWriteSize is set to the maximum size, in bytes, of the Length in an SMB2 WRITE Request (section 2.2.21) that the server will accept on the transport that established this connection. This value SHOULD<232> be greater than or equal to 65536. Connection.MaxWriteSize MUST be set to MaxWriteSize.
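
<231> Section 3.3.5.4: If the underlying transport is NETBIOS over TCP, Windows servers set MaxReadSize to 65536. Otherwise, MaxReadSize is set based on the following table.

Windows version\Connection.Dialect                                          | 2.0.2 | All other SMB2 dialects
Windows Vista SP1\Windows Server 2008                                       | 65536 | N/A
Windows 7\Windows Server 2008 R2                                            | 65536 | 1048576
Windows 8 without [MSKB-2934016]\Windows Server 2012 without [MSKB-2934016] | 65536 | 1048576
All other SMB2 servers                                                      | 65536 | 8388608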

<232> Section 3.3.5.4: If the underlying transport is NETBIOS over TCP, Windows servers set MaxWriteSize to 65536. Otherwise, MaxWriteSize is set based on the following table.

Windows version\Connection.Dialect                                          | 2.0.2 | All other SMB2 dialects
Windows Vista SP1\Windows Server 2008                                       | 65536 | N/A
Windows 7\Windows Server 2008 R2                                            | 65536 | 1048576
Windows 8 without [MSKB-2934016]\Windows Server 2012 without [MSKB-2934016] | 65536 | 1048576
All other SMB2 servers                                                      | 65536 | 8388608
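
The two checks quoted from the READ processing rules can be sketched in a few lines. This is an illustration only, not the server's actual code: the function names and the "OK" return value are made up for the example, and the credit formula (one credit per 65536 bytes of payload) is my reading of the credit rules in the SMB2 specification, so treat it as an assumption.

```python
CREDIT_UNIT = 65536  # assumed: bytes of payload covered by one credit

def expected_credit_charge(length: int) -> int:
    """Credits needed for a payload of `length` bytes (at least one)."""
    return (max(length, 1) - 1) // CREDIT_UNIT + 1

def check_read(length: int, max_read_size: int,
               supports_multi_credit: bool, credit_charge: int) -> str:
    """Apply only the two validation steps quoted above (hypothetical helper)."""
    if length > max_read_size:
        return "STATUS_INVALID_PARAMETER"   # Length exceeds Connection.MaxReadSize
    if supports_multi_credit and credit_charge < expected_credit_charge(length):
        return "STATUS_INVALID_PARAMETER"   # CreditCharge too small for Length
    return "OK"

# An 8388608-byte limit, as in the last table row.
print(expected_credit_charge(1048576))              # 16
print(check_read(1048576, 8388608, True, 16))       # OK
print(check_read(1048576, 8388608, True, 1))        # STATUS_INVALID_PARAMETER
print(check_read(8388609, 8388608, True, 129))      # STATUS_INVALID_PARAMETER
```

In other words, the largest single read or write is bounded by the sizes the server returns during negotiation, which is why the tables above matter.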

Update: If you missed the live event, it’s now available on-demand. You can also download the webcast slides.

Rock n’ Roll with SMB3

Server Message Block (SMB) is the core file-transfer protocol of Windows, MacOS and Samba, and has become widely deployed. It’s ubiquitous – a 30-year-old family of network code.

However, the latest iteration of SMB3 is almost unrecognizable when compared to versions only a few years old. That’s why the SNIA Ethernet Storage Forum (ESF) has invited Microsoft’s Ned Pyle, program manager of the SMB protocol, to speak at our live webcast, “Rockin’ and Rollin’ with SMB3.”

Extensive reengineering has led to advanced capabilities that include multichannel, transparent failover, scale out, and encryption. SMB Direct makes use of RDMA networking, creates a block transport system, and provides reliable transport to zettabytes of unstructured data, worldwide.

SMB3 forms the basis of hyperconverged and scale-out systems for virtualization and SQL Server. It is available for a variety of hardware devices, from printers and network-attached storage appliances to Storage Area Networks. It is often the most prevalent protocol on a network, with high-performance data transfers as well as efficient end-user access over wide-area connections. Register now for the live event on April 5th to hear:

  • Brief background on SMB
  • An overview of the SMB 3.x family, first released with Windows 8, Windows Server 2012, MacOS 10.10, Samba 4.1, and Linux CIFS 3.12
  • What changed in SMB 3.1.1
  • Understanding SMB security, scenarios, and workloads
  • The deprecation and removal of the legacy SMB1 protocol
  • How SMB3 supports hyperconverged and scale-out storage

This is a unique opportunity to “rock out” with an SMB3 expert on the front lines at Microsoft. We hope to see you on April 5th.


Benefits of RDMA in Accelerating Ethernet Storage Q&A

At our recent live Webcast “Benefits of RDMA in Accelerating Ethernet Storage Connectivity,” experts from Emulex, Intel and Microsoft had an insightful discussion on the ways RDMA is having an impact on Ethernet storage. The live event was attended by nearly 200 people, and feedback was overwhelmingly positive, with several attendees thanking us for our vendor-neutral presentation and one attendee commenting that it was, “Probably the most clearly comprehensible yet comprehensive webinar I’ve attended in some time.” If you missed the Webcast, it’s now available on demand. We did not have time to get to everyone’s questions, so as promised, below are answers to all of them. If you have additional questions, please ask them in the comments section of this blog and we’ll get back to you as soon as possible.

Q.  Is RDMA over RoCEv2 in production?

A. The IBTA released the RoCEv2 Specification in September 2014. In order to support that specification, changes may be required across the RDMA stack, including firmware, drivers and operating systems. Schedules for implementation of that specification will vary by operating system. For example, the OpenFabrics Alliance (OFA) has not yet released an OpenFabrics Enterprise Distribution (OFED) version that implements that standard, although it is in process now. Once OFA completes their OFED stack implementation, the Linux distribution vendors will then incorporate and support the updated OFED stack. Implementations provided prior to full OFA and distro vendor support would be preliminary, potentially incompatible with the OFED release, and would require confirmation by the distro vendor with regard to the nature and level of support they would be providing.

Q. I would have liked a list of Windows applications that take advantage of SMB Direct – both in a Hyper-V host or bare metal.

A.  In Windows, any file-based application can make use of SMB3 and SMB Direct due to the native file-based programming interface support. No application changes are required. For certain enterprise applications such as Hyper-V and SQL Server, SMB3 is officially supported, and more information can be found in the product catalog at www.microsoft.com.

Q. Are there any particular benefits in using one network protocol over another for SMB Direct/RDMA (iWARP vs. RoCE vs. IB)?

A. There are no hard and fast rules; any adapter or protocol can be suitable for many scenarios. Of the Ethernet-based protocols we considered in today’s webcast:

  • iWARP offers the benefit of operation over TCP with its reliability and routability, well-suited to a broad range of installed infrastructure.
  • RoCE offers a lightweight, efficient protocol when a DCB-enabled switched fabric is available. RoCE, however, is not routable.
  • RoCEv2 offers similar properties to RoCE, with the possibility to scale to larger routed and DCB-enabled fabrics.

Q. Who are the vendors offering iWARP capable RNICs?

A. Chelsio Communications has production iWARP adapters today, and both Intel and QLogic have publicly committed to future iWARP controllers.

Q. How much testing has been done with SMB3, and in particular SMB Direct, over WAN connections?

A. The SMB2 protocol was originally designed to adapt to WAN scenarios, and supports a credit-based management of large amounts of data to be outstanding, to make best use of WAN-type long pipes. The SMB3 protocol retains these design attributes, and the SMB Direct protocol also supports similar deep pipelining. The iWARP protocol, being layered on standard TCP, is well suited to such deployments, and RoCE WAN adapters are potentially available. Please contact the respective technology vendors for information on any available testing results.
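
To see why keeping large amounts of data outstanding matters on a long pipe, consider the bandwidth-delay product. The sketch below is a back-of-the-envelope calculation, not a measurement: the link speed, round-trip time, and function names are hypothetical values chosen for illustration.

```python
def bytes_in_flight(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: data that must be outstanding to fill the pipe."""
    return bandwidth_bps * rtt_ms // 8000  # (bits/s * ms) / (8 bits * 1000 ms/s)

def requests_to_fill(bandwidth_bps: int, rtt_ms: int, request_bytes: int) -> int:
    """Number of same-sized requests that must be outstanding at once, rounded up."""
    bdp = bytes_in_flight(bandwidth_bps, rtt_ms)
    return -(-bdp // request_bytes)  # ceiling division

# Hypothetical WAN link: 1 Gb/s with a 50 ms round trip.
print(bytes_in_flight(1_000_000_000, 50))              # 6250000
print(requests_to_fill(1_000_000_000, 50, 1_048_576))  # 6
print(requests_to_fill(1_000_000_000, 50, 65_536))     # 96
```

A client that could only have one small request outstanding at a time would leave most of such a link idle, which is the problem the credit mechanism and deep pipelining address.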

Q. I’d love a future webcast on RDMA-enabled distributed filesystems.

A. Thanks for the suggestion! We’re always looking for ideas for future webcasts and SNIA-ESF will consider this as a potential follow-on.

Q.  Is Live Migration the scenario where “packet size” is 1MB?

A. All SMB Direct scenarios have workloads that range anywhere up to 8MB. For large file copies, most SMB3 clients request from 1MB to 8MB per operation. For Hyper-V live migration, transfers are typically similar during the bulk transfer phase.

Q. SMB3 is being compared to FC for enterprise. If Ethernet based protocols are of interest, wouldn’t FCoE give the same performance as FC (same stack) vs. SMB3?

A. SMB3 with SMB Direct enables many workloads not possible with Fibre Channel over Ethernet, and performance comparisons are therefore difficult. Perhaps another SNIA webcast could investigate this!

Q. Regarding your SMB Direct example with lots of small operations, how do you deal with the overhead of registering and unregistering buffers for the RDMA operations?

A. As answered later in the session, the registration and unregistration is not a protocol matter, but in the case of the Windows implementation, it is strictly performed for the specific buffers of each operation, which is critical for security, data integrity, and system protection. The standard “Fast Register Work Request” method is used, and careful implementation has shown that the overhead does not negatively impact performance, even for small I/O (4KB/operation). Check out Jose Barreto’s blog, which contains many benchmark results.

Q. But isn’t Live Migration done in 1MB “chunks”? So not “small” I/Os?

A. As answered later in the session, Hyper-V Live Migration is done in several phases. The first phase is the initial bulk copy of memory, done in large chunks. It is immediately followed by a second phase that copies the individual pages dirtied by the live-running VM in the meantime. These operations are typically 4KB. Note: The faster the initial phase goes, the less work there is in this second phase. In both phases, faster is better, and RDMA accelerates both.
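
The relationship between the two phases can be illustrated with a deliberately simple model. This is not how Hyper-V is implemented: it assumes a constant page-dirtying rate and a single bulk pass, and the memory size, link rates, and dirty rate are hypothetical numbers picked for the example.

```python
PAGE_SIZE = 4096  # bytes; matches the typical 4KB page copies described above

def dirty_pages_after_bulk_copy(memory_bytes: int,
                                link_bytes_per_s: float,
                                dirty_pages_per_s: float) -> int:
    """Pages left to re-copy in phase two under a constant dirty rate (toy model)."""
    total_pages = memory_bytes // PAGE_SIZE
    bulk_seconds = memory_bytes / link_bytes_per_s   # duration of phase one
    return min(total_pages, int(bulk_seconds * dirty_pages_per_s))

memory = 8 * 2**30  # hypothetical VM with 8 GiB of memory
slow = dirty_pages_after_bulk_copy(memory, 1.25e9, 10_000)   # roughly a 10Gb link
fast = dirty_pages_after_bulk_copy(memory, 12.5e9, 10_000)   # roughly a 100Gb link
print(slow, fast)  # the faster link leaves far fewer 4KB pages for phase two
```

Under these assumptions, a tenfold faster bulk copy leaves roughly a tenth of the dirty pages, which is the effect the note above describes.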

Q. Are iSER and iWARP alternatives to one another?

A. Not exactly. iWARP is an RDMA protocol, whereas iSER is a mapping of iSCSI onto RDMA transports: iWARP as well as RoCE and InfiniBand.

Q. What’s Intel’s roadmap for RoCE and/or iWARP?

A.  Intel is committed to iWARP and plans to incorporate it in future server chipsets and SOCs. See http://www.intel.com/content/www/us/en/ethernet-products/accelerating-ethernet-iwarp-video.html for more information.

Q. Is there any transport other than IB being used to create a reliable transport for RoCEv2? In principle, is it possible?

A. RoCE was developed to leverage InfiniBand as much as possible. For that reason, the InfiniBand transport was chosen when the RoCE standard was developed. As the RoCEv2 standard was developed, the underlying InfiniBand network protocol was replaced with IPv4/IPv6 in order to provide layer 3 routability, and with UDP to provide stateless encapsulation (and indication) of the InfiniBand transport header that was retained. While it may be possible to develop a reliable transport to replace InfiniBand, the RoCE standards body has elected not to go that route as of this writing.
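
The layering change described above can be written out as data. The sketch below is illustrative only: the header sizes (untagged Ethernet, IPv4 without options) and the UDP destination port 4791 are my understanding of the respective specifications rather than anything stated in the webcast, and the names are made up for the example.

```python
ROCEV2_UDP_DST_PORT = 4791  # assumed: the registered port that "indicates" RoCEv2

# (layer name, header bytes) from the outermost header inwards.
ROCE_V1 = [("Ethernet", 14), ("InfiniBand GRH (network)", 40),
           ("InfiniBand BTH (transport)", 12)]
ROCE_V2_IPV4 = [("Ethernet", 14), ("IPv4", 20), ("UDP", 8),
                ("InfiniBand BTH (transport)", 12)]
ROCE_V2_IPV6 = [("Ethernet", 14), ("IPv6", 40), ("UDP", 8),
                ("InfiniBand BTH (transport)", 12)]

def header_bytes(layers):
    """Total header bytes in front of the RDMA payload for one frame."""
    return sum(size for _, size in layers)

for name, layers in [("RoCE", ROCE_V1), ("RoCEv2/IPv4", ROCE_V2_IPV4),
                     ("RoCEv2/IPv6", ROCE_V2_IPV6)]:
    print(name, " > ".join(layer for layer, _ in layers), header_bytes(layers))
```

In every variant the InfiniBand transport header stays in place; only the network layer beneath it changes, which is what makes RoCEv2 routable.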


Upcoming Plugfests at SDC

This year’s SNIA Storage Developer Conference (SDC) will take place in Santa Clara, CA Sept. 15-18.   In addition to an exciting agenda with great speakers, there is an opportunity for vendors to participate in SNIA Plugfests. Two Plugfests that I think are worth noting are: SMB2/SMB3 and iSCSI.

These Plugfests enable a vendor to bring their implementations of SMB2/SMB3 and/or iSCSI to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products. SNIA provides and supports networks and infrastructure for the Plugfest, creating a collaborative framework for testing. Plugfest participants work together to define the testing process, assuring that objectives are accomplished.

Still Time to Register

Great news! There is still time to register. Setup for the Plugfest begins on September 13, 2014 and testing begins on September 14th.

Register here for the SMB2/SMB3 Plugfest

Register here for the iSCSI Plugfest

What to Expect at a Plugfest

Learn more about what takes place at the Plugfests by watching the video interview of Jeremy Allison, Co-Creator of Samba, as he candidly talks about what to expect at an SDC Plugfest.

Learn more about the Plugfest registration process. If you have additional questions, please contact Arnold Jones (arnold@snia.org).