Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be

NFSv4 has been a standard file sharing protocol since 2003, but has not been widely adopted, partly because NFSv3 was “just good enough”. Yet NFSv4 improves on NFSv3 in many important ways, and NFSv4.1 is a further improvement on that. In this post, I explain how NFSv4.1 is better suited to a wide range of datacenter and HPC uses than its predecessors NFSv3 and NFSv4, and I provide resources for migrating from NFSv3 to NFSv4.1. Most importantly, I make the argument that users should, at the very least, be evaluating and deploying NFSv4.1 for use in new projects, and ideally should be using it wholesale in their existing environments.

The background to NFSv4.1
NFSv3 (specified in RFC-1813, but never an Internet standard), the popular successor to NFSv2, was first released in 1995 by Sun. NFSv3 has proved a popular and robust protocol over the 15 years it has been in use, and with wide adoption it soon eclipsed some of the early competitive UNIX-based filesystem protocols such as DFS and AFS. NFSv3 was extensively adopted by storage vendors and OS implementers beyond Sun’s Solaris; it was available on an extensive list of systems, including IBM’s AIX, HP’s HP-UX, Linux and FreeBSD. Even non-UNIX systems adopted NFSv3, including Mac OS, OpenVMS, Microsoft Windows, Novell NetWare, and IBM’s AS/400 systems. In recognition of the advantages of interoperability and standardization, Sun relinquished control of future NFS standards work. The work leading to NFSv4 proceeded by agreement between Sun and the Internet Society (ISOC), and is undertaken under the auspices of the Internet Engineering Task Force (IETF).

In April 2003, the Network File System (NFS) version 4 Protocol, described in RFC-3530, was ratified as an Internet standard, superseding NFSv3. This was the first open filesystem and networking protocol from the IETF. NFSv4 introduces the concept of state to ameliorate some of the less desirable features of NFSv3, along with other enhancements to improve usability, management and performance.

But shortly following its release, an Internet draft written by Garth Gibson and Peter Corbett outlined several problems with NFSv4, specifically limited bandwidth and scalability, since NFSv4, like NFSv3, requires that all access go through a single server. NFSv4.1 (as described in RFC-5661, ratified in January 2010) was developed to overcome these limitations, and new features such as parallel NFS (pNFS) were standardized to address them.

NFSv4.2 is now moving towards ratification. In a change from the original IETF NFSv4 development work, where each revision took a significant amount of time to develop and ratify, the workgroup charter was modified to ensure that there would be no more large standards documents that took years to develop, such as RFC-5661, and that additions to the standard would be an ongoing yearly process. With these changes in the processes leading to standardization, features that will be ratified in NFSv4.2 (expected in early 2013) are available from many vendors and suppliers now.

Adoption of NFSv4.1
Every so often, I and others in the industry run Birds-of-a-Feather sessions (BoFs) on the availability of NFSv4.1 clients and servers, and on the adoption of NFSv4.1 and pNFS. At our latest BoF at LISA ’12 in San Diego in December 2012, many of the attendees agreed: it’s time to move to NFSv4.1.

While there have been many advances and improvements to NFS, many users have elected to continue with NFSv3. NFSv4.1 is a mature and stable protocol with many advantages in its own right over its predecessors NFSv3 and NFSv2, yet adoption remains slow. Adequate for some purposes, NFSv3 is a familiar and well understood protocol; but with the demands being placed on storage by exponentially increasing data and compute growth, NFSv3 has become increasingly difficult to deploy and manage.

In essence, NFSv3 suffers from problems associated with statelessness. While some protocols such as HTTP and other RESTful APIs see benefit from not associating state with transactions – it considerably simplifies application development if no transaction from client to server depends on another transaction – in the NFS case, statelessness has led, amongst other downsides, to performance and lock management issues.

NFSv4.1 and parallel NFS (pNFS) address well-known NFSv3 “workarounds” that are used to obtain high bandwidth access; users that employ (usually very complicated) NFSv3 automounter maps and modify them to manage load balancing should find pNFS provides comparable performance that is significantly easier to manage.

So what’s the problem with NFSv3?
Extending the use of NFS across the WAN is difficult with NFSv3. Firewalls typically filter traffic based on well-known port numbers, but if the NFSv3 client is inside a firewalled network, and the server is outside the network, the firewall needs to know what ports the portmapper, mountd and nfsd servers are listening on. As a result of this promiscuous use of ports, the multiplicity of “moving parts” and a justifiable wariness on the part of network administrators to punch random holes through firewalls, NFSv3 is not practical to use in a WAN environment. By contrast, NFSv4 integrates many of these functions, and mandates that all traffic (now exclusively TCP) uses the single well-known port 2049.
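
To make the firewall problem concrete, here is a minimal Python sketch that models which ports can be written into a firewall rule ahead of time. It is a simplification, not a description of any particular server: the `Service` type and the helper names are hypothetical, and the mountd and lock manager port numbers are made-up stand-ins for whatever the portmapper happens to hand out on a given machine.

```python
from dataclasses import dataclass

NFS_PORT = 2049         # well-known NFS port
PORTMAPPER_PORT = 111   # well-known portmapper port


@dataclass(frozen=True)
class Service:
    name: str
    port: int
    fixed: bool  # True if the port is known before talking to the server


def required_services(version):
    """Services a client must reach through the firewall (simplified model)."""
    if version >= 4:
        # NFSv4 folds mounting and locking into the protocol itself.
        return [Service("nfs", NFS_PORT, fixed=True)]
    return [
        Service("portmapper", PORTMAPPER_PORT, fixed=True),
        Service("nfsd", NFS_PORT, fixed=True),
        Service("mountd", 20048, fixed=False),    # assumed, dynamically assigned
        Service("nlockmgr", 32803, fixed=False),  # assumed, dynamically assigned
    ]


def static_rules(version):
    """Ports that can be placed in a firewall rule in advance."""
    return sorted(s.port for s in required_services(version) if s.fixed)


def unpredictable(version):
    """Services whose port the firewall cannot know without asking the server."""
    return [s.name for s in required_services(version) if not s.fixed]


if __name__ == "__main__":
    for v in (3, 4):
        print(f"NFSv{v}: open {static_rules(v)}, unknown ports for {unpredictable(v)}")
```

In this model the NFSv4 rule set is a single port, while the NFSv3 set still leaves two services on ports the firewall cannot predict.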

In addition, NFSv3 is very chatty for WAN usage: many messages may be sent between the client and the server to undertake simple activities, such as finding, opening, reading and closing a file. NFSv4 can compound these operations into a single RPC (Remote Procedure Call) and considerably reduce the back-and-forth traffic across the network. The end result is reduced latency.
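
The latency saving is simple arithmetic, sketched below in Python. The operation names mirror the activities named above rather than an exact NFSv3 or NFSv4 procedure list, the 50 ms round-trip time is an assumption, and real NFSv4 clients do not necessarily pack everything into one compound, so treat this as an upper bound on the benefit.

```python
# Illustrative operation list; not an exact NFSv3 or NFSv4 procedure list.
OPERATIONS = ["LOOKUP", "OPEN", "READ", "CLOSE"]


def round_trips_separate(ops):
    """One RPC, and so one network round trip, per operation."""
    return len(ops)


def round_trips_compound(ops):
    """All operations packed into a single compound RPC."""
    return 1 if ops else 0


def latency_ms(round_trips, rtt_ms):
    """Network latency only; server processing time is ignored."""
    return round_trips * rtt_ms


if __name__ == "__main__":
    rtt_ms = 50  # assumed WAN round-trip time in milliseconds
    for label, count in (
        ("separate RPCs", round_trips_separate),
        ("one compound", round_trips_compound),
    ):
        trips = count(OPERATIONS)
        print(f"{label}: {trips} round trip(s), {latency_ms(trips, rtt_ms)} ms")
```

With the assumed 50 ms round trip, four separate RPCs cost 200 ms of network latency against 50 ms for a single compound.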

One of the most annoying NFSv3 “features” has been its handling of locks. Although NFSv3 is stateless, the essential addition of lock management (NLM) to prevent file corruption by competing clients means NFSv3 application recovery is slowed considerably. Very often stale locks have to be manually released, and the lock management is handled external to the protocol. NFSv4’s built-in lock leasing, lock timeouts, and client-server negotiation on recovery simplify management considerably.
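
The idea behind lock leasing can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the NFSv4 state machine: the `LeaseLockTable` class and its methods are hypothetical, and details such as state identifiers, grace periods and server reboot recovery are left out. What it does show is why stale locks stop being a manual cleanup job: a client that stops renewing its lease loses its locks automatically.

```python
import time


class LeaseLockTable:
    """Toy lock table in which locks live only as long as the owner's lease."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self._clock = clock          # injectable so the example is testable
        self._lease_expiry = {}      # client_id -> time the lease runs out
        self._locks = {}             # path -> client_id holding the lock

    def renew(self, client_id):
        """Start or extend a client's lease."""
        self._lease_expiry[client_id] = self._clock() + self.lease_seconds

    def _expire(self):
        """Drop clients whose lease has run out, along with their locks."""
        now = self._clock()
        dead = {c for c, t in self._lease_expiry.items() if t <= now}
        for client_id in dead:
            del self._lease_expiry[client_id]
        self._locks = {p: c for p, c in self._locks.items() if c not in dead}

    def lock(self, client_id, path):
        """Try to take a lock; any request also renews the caller's lease."""
        self._expire()
        self.renew(client_id)
        holder = self._locks.get(path)
        if holder is not None and holder != client_id:
            return False
        self._locks[path] = client_id
        return True

    def holder(self, path):
        """Return the current lock holder, or None if the path is unlocked."""
        self._expire()
        return self._locks.get(path)


if __name__ == "__main__":
    now = [0.0]
    table = LeaseLockTable(lease_seconds=10, clock=lambda: now[0])
    print(table.lock("client-a", "/export/data"))  # client-a takes the lock
    print(table.lock("client-b", "/export/data"))  # refused while a's lease is live
    now[0] = 11.0                                  # client-a goes silent
    print(table.lock("client-b", "/export/data"))  # lease expired, lock is free
```

The clock is passed in so the expiry behaviour can be exercised without waiting for real time to pass.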

In a change from NFSv3, these locking and delegation features make NFSv4 stateful, but the simplicity of the original design is retained through well-defined recovery semantics in the face of client and server failures and network partitions. These are just some of the benefits that make NFSv4.1 desirable as a modern datacenter protocol, and for use in HPC, database and highly virtualized applications.

NFSv3 is extremely difficult to parallelize, and often takes some vendor-specific “pixie dust” to accomplish. In contrast, pNFS with NFSv4.1 brings parallelization directly into the protocol; it allows many streams of data to multiple servers simultaneously, and it supports files as per usual, along with block and object support through an extensible layout mechanism. The management is definitely easier, as NFSv3 automounter maps and hand-created load-balancing schemes are eliminated and, by providing a standardized interface, pNFS ensures fewer issues in supporting multi-vendor NFS server environments.
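
As a rough illustration of what “many streams of data to multiple servers” means, the Python sketch below splits one byte range into per-server requests using plain round-robin striping. It is an assumption-laden simplification: the `split_read` helper, the stripe size and the server names are all hypothetical, and a real pNFS client works from a layout handed out by the server rather than from a hard-coded list.

```python
def split_read(offset, length, stripe_unit, data_servers):
    """Split a byte range into (server, offset, length) chunks.

    Assumes simple round-robin striping: stripe number i of the file
    lives on data_servers[i % len(data_servers)].
    """
    chunks = []
    end = offset + length
    while offset < end:
        stripe_index = offset // stripe_unit
        server = data_servers[stripe_index % len(data_servers)]
        stripe_end = (stripe_index + 1) * stripe_unit
        # Read up to the end of this stripe or the end of the request.
        n = min(end, stripe_end) - offset
        chunks.append((server, offset, n))
        offset += n
    return chunks


if __name__ == "__main__":
    servers = ["ds1", "ds2", "ds3"]  # hypothetical data server names
    for chunk in split_read(offset=0, length=10, stripe_unit=4, data_servers=servers):
        print(chunk)
```

Each chunk can be fetched from its own server at the same time, which is where the extra bandwidth comes from; the client needs no automounter map to work out where the data lives.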

Next post: the Advantages of NFSv4.1

FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.
