New Webcast: Cloud File Services: SMB/CIFS and NFS…in the Cloud

Imagine evaporating your existing file server into the cloud with the same experience and level of control that you currently have on-premises. On October 1st, ESF will host a live Webcast that introduces the concept of Cloud File Services and discusses the pros and cons you need to consider.

There are numerous companies and startups offering “sync & share” solutions across multiple devices and locations, but for the enterprise, caveats are everywhere. Register now for this Webcast to learn:

  • Key drivers for file storage
  • Split administration with sync & share solutions and on-premises file services
  • Applications over File Services on-premises (SMB 3, NFS 4.1)
  • Moving to the cloud: your storage OS in a hyperscalar or service provider
  • Accommodating existing File Services workloads with Cloud File Services
  • Accommodating  cloud-hosted applications over Cloud File Services

This Webcast will be a vendor-neutral and informative discussion on this hot topic. And since it’s live, we encourage your to bring your questions for our experts. I hope you’ll register today and we’ll look forward to having you attend on October 1st.    

 

Ethernet Meets Enterprise Storage – Finally

Presumptuous, yes, because Ethernet has been a mainstay in enterprises since its early days over 40 years ago.   It initially grew to prominence as the local area network (LAN) connection in the enterprise. More recent advances have enabled Ethernet to become a standard for mission critical storage connectivity for block, file and object storage in many enterprises.

Block storage in large enterprises has long been focused on Fibre Channel due to its performance capabilities.     In order to bring the same performance benefits to Ethernet, the IEEE 802.1 Data Center Bridging Task Group proposed a number of new standards to enhance Ethernet reliability.   For example, 802.1Qbb Priority-based Flow Control (PFC) provides a link level flow control mechanism to ensure lossless transmission under congestion, 802.1Qaz Enhanced Transmission Selection (ETS) provides a management framework for prioritized bandwidth and Data Center Bridging Exchange Protocol (DCBX) enabled these features to be used between neighbors to ensure consistency on the network. Collectively, these and other enhancements have brought those enterprise-class storage networking features to the Ethernet platform.

In addition, the International Committee for Information Technology Services (INCITS) T11 Fibre Channel committee developed a specification for Fibre Channel over Ethernet (FCoE) in its FC-BB-5 standard in 2009, which allows the Fibre Channel protocol to run directly on top of Ethernet, eliminating the TCP/IP stack and allowing for efficient performance of the Fibre Channel protocol.   FCoE also depends on the Data Center Bridging standards from IEEE 802.1 in order to ensure the “losslessness” and flow control needed by Fibre Channel.

An alternative to FCoE, iSCSI, was designed to run over standard Ethernet with TCP/IP and was designed to tolerate the “lossy” aspects of Ethernet.   Its architecture and the additional layers of encapsulation involved can impact latency and performance. However, more recent innovations in iSCSI have enabled it to run over a DCB Ethernet network, which enables iSCSI to inherit some of the enterprise storage features which have always been inherent in Fibre Channel.   For more on this, read last year’s blog  “How DCB Makes iSCSI Better ” from Allen Ordoubadian.

In 2013, INCITS submitted the FC-BB-6 standard for review which introduced, among other things, the VN2VN standard.   The VN2VN proposal will allow FCoE to work in a standard DCB switching environment without the presence of a Fibre Channel Forwarder (FCF).   An FCF allows for bridging between servers which are communicating with FCoE and storage devices which are communicating with traditional Fibre Channel.   As DCB switches and FCoE storage become more prevalent, the FC-BB-6 standard will allow for end-to-end FCoE connectivity in either a point to point (P2P) or DCB mesh environment. This will result in lower cost for FCoE environments. Products are beginning to appear which support VN2VN and over the next 18 months it is likely that all major vendors will support it. Check out our ESF Webcast “How VN2VN Will Help Accelerate Adoption of FCoE” for more details.

The availability of CNAs with processing capability allows for offloading storage protocol processing from the host processor, though some CNAs use host-based storage protocol initiators in system software and do selective stateless offloads in the data path.   Both FCoE and iSCSI require the storage protocol to be encapsulated in a frame to be sent across the Ethernet network.   In an enterprise environment, especially a virtual server environment, CPU utilization is tracked closely and target CPU thresholds are often set.   Anything which can minimize spikes in CPU utilization can allow for more workloads to be placed on servers and allows for predictable energy consumption.

For file storage, Ethernet has traditionally been the connectivity option of choice for file servers used as “shares” for centralized employee document storage. In the 21st century, usage of network attached storage (NAS) with the Network File System (NFS) has increased for enterprise databases and Hadoop clusters, especially with the availability of 10Gb Ethernet.   New features in NFS 4 and later introduced security and stateful protocol support after development of NFS was taken over by the Internet Engineering Task Force (IETF).

Object storage, has been around for nearly 20 years as a repository for storing data as objects which include not only the original file, but also a globally unique identifier and metadata which describes the object and various parameters about the object.   It has been used to store many forms of unstructured data, but found niches in certain areas, such as legal documents with retention policies and archiving photos and videos.   More recently, there seems to be a resurgence in object storage as the amount of unstructured data generated by enterprises continues to skyrocket.   Open source object storage in Ceph and OpenStack are also helping to drive the adoption. SNIA ESF is hosting a live Webcast on object storage on June 11, 2014, called “Object Storage 101.” I encourage you to register for this presentation for an unbiased look at the what, how and why of object storage technologies.

When combined with the advances in link speed, throughput capabilities, latency and input/output operations per second (IOPS) in modern 10Gb/s and 40Gb/s Ethernet, these existing and emerging Ethernet standards and storage architectures are having a profound effect on the ability of Ethernet as an enterprise class storage networking platform.   Vendors and customers are seeing the advantage in one wire, the Ethernet cable, carrying all LAN, WAN and storage traffic.

 

 

 

2013 in Review and the Outlook for 2014 – A SNIA ESF Perspective

Technology continues to advance rapidly. Making sense of it all can be a challenge. At the SNIA Ethernet Storage Forum, we focus on storage technologies and solutions enabled by and associated with Ethernet Networks. Last year, we modified the charters of our two Special Interest Groups (SIG) to address topics about file protocols and storage over Ethernet. The File Protocols SIG includes the prior focus on Network File System (NFS) related topics and adds discussions around Server Message Block (SMB / CIFS). We had our first webcast last November on the topic of SMB 3.0 and it was our best attended webcast ever. The Storage over Ethernet SIG focuses on general Ethernet storage topics as well as more information about technologies like FCoE, iSCSI, Data Center Bridging, and virtual networking for storage. I encourage you to check out other articles on these hot topics in this SNIAESF blog to hear from our member experts as well as guest posts from leading analysts.

2013 was a busy year and we are already kickin’ it in 2014. This should be an exciting year in IT. Data storage continues to be a hot sector especially in the areas of All-Flash and Hybrid arrays. This year, we will expect to see new standards coming out of the T11 committee for Fibre Channel and possibly FCoE as well as progress in high speed Ethernet networks. Lower cost network interconnects will facilitate adoption of high speed networks in the small to midsize business segment. And a new conversation around “Software Defined…” should push a lot of ink in trade rags and other news sources. Oh, and don’t forget about the “Internet of Things”, mobile solutions, and all things Cloud.

The ESF will be addressing the impact on Ethernet storage solutions from these hot technologies. Next month, on February 18th, experts from the ESF, along with industry analysts from Dell’Oro Group will speak to the benefits and best practices of deploying FCoE and iSCSI storage protocols. This presentation “Use Cases for iSCSI and Fibre Channel: Where Each Makes Sense” will be part of an upcoming BrightTalk Summit on Storage Networking. I encourage you to register for this session. Additionally, we will be publishing a couple of white papers on file-based storage and a review of FCoE and iSCSI in storage applications.

Finally, SNIA will be kicking off its first year of the new user conference, Data Storage Innovation Conference. This will be one of the few storage focused user conferences in the market and should be quite interesting.

We’re excited about our growing membership and our plans for 2014. Our goal is to advance  application of innovative technologies and we encourage you to send us mail or comment below with topics that are of interest to you.

Here’s to an exciting 2014!

SUSE Announces NFSv4.1 and pNFS Support

SUSE, founded in 1992, provides an enterprise ready Linux distribution in the form of SLES; the SUSE Linux Enterprise Server. As of late last month (October 22, 2013), SUSE announced that SLES 11 with service pack 3 now supports the Linux client for NFSv4.1 and pNFS client. This major distribution joins RedHat’s RHEL (RedHat Enterprise Linux) 6.4 in supporting enterprise quality Linux distributions with support for files based NFSv4.1 and pNFS.

For the adventurous, block and object pNFS support is available in the upstream kernel. Most regularly maintained distributions based on a Linux 3.1 or better kernel (if not all distributions now – check with the supplier of the distribution if you’re unsure) should provide the files, block and object compliant client directly in the download.

The future of pNFS looks very exciting. We now have a fully pNFS compliant Linux client, and a number of commercial files, blocks and object servers. Remember that although pNFS block and object support is available, currently these distributions support only the pNFS files layout. For those users not needing pNFS with block or objects support and requiring enterprise quality support, SUSE and RedHat are an excellent solution.

pNFS and Future NFSv4.2 Features

In this third and final blog post on NFS (see previous blog posts Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be and The Advantages of NFSv4.1) I’ll cover pNFS (parallel NFS), an optional feature of NFSv4.1 that improves the bandwidth available for NFS protocol access, and some of the proposed features of NFSv4.2 – some of which are already implemented in commercially available servers, but will be standardized with the ratification of NFSv4.2 (for details, see the IETF NFSv4.2 draft documents).

Finally, I’ll point out where you can get NFSv4.1 clients with support for pNFS today.

Parallel NFS (pNFS) and Layouts

Parallel NFS (pNFS) represents a major step forward in the development of NFS. Ratified in January 2010 and described in RFC-5661, pNFS depends on the NFS client understanding how a clustered filesystem stripes and manages data. It’s not an attribute of the data, but an arrangement between the server and the client, so data can still be accessed via non-pNFS and other file access protocols.   pNFS benefits workloads with many small files, or very large files, especially those run on compute clusters requiring simultaneous, parallel access to data.

NFS 3 image 1

Clients request information about data layout from a Metadata Server (MDS), and get returned layouts that describe the location of the data. (Although often shown as separate, the MDS may or may not be standalone nodes in the storage system depending on a particular storage vendor’s hardware architecture.) The data may be on many data servers, and is accessed directly by the client over multiple paths. Layouts can be recalled by the server, as in the case for delegations, if there are multiple conflicting client requests.  

By allowing the aggregation of bandwidth, pNFS relieves performance issues that are associated with point-to-point connections. With pNFS, clients access data servers directly and in parallel, ensuring that no single storage node is a bottleneck. pNFS also ensures that data can be better load balanced to meet the needs of the client.

The pNFS specification also accommodates support for multiple layouts, defining the protocol used between clients and data servers. Currently, three layouts are specified; files as supported by NFSv4, objects based on the Object-based Storage Device Commands (OSD) standard (INCITS T10) approved in 2004, and block layouts (either FC or iSCSI access). The layout choice in any given architecture is expected to make a difference in performance and functionality. For example, pNFS object based implementations may perform RAID parity calculations in software on the client, to allow RAID performance to scale with the number of clients and to ensure end-to-end data integrity across the network to the data servers.

So although pNFS is new to the NFS standard, the experience of users with proprietary precursor protocols to pNFS shows that high bandwidth access to data with pNFS will be of considerable benefit.

Potential performance of pNFS is definitely superior to that of NFSv3 for similar configurations of storage, network and server. The management is definitely easier, as NFSv3 automounter maps and hand-created load balancing schemes are eliminated; and by providing a standardized interface, pNFS ensures fewer issues in supporting multi-vendor NFS server environments.

Some Proposed NFSv4.2 features

NFSv4.2 promises many features that end-users have been requesting, and that makes NFS more relevant as not only an “every day” protocol, but one that has application beyond the data center. As the requirements document for NFSv4.2 puts it, there are requirements for:  

  • High efficiency and utilization of resources such as, capacity, network bandwidth, and processors.
  • Solid state flash storage which promises faster throughput and lower latency than magnetic disk drives and lower cost than dynamic random access memory.

Server Side Copy

Server-Side Copy (SSC) removes one leg of a copy operation. Instead of reading entire files or even directories of files from one server through the client, and then writing them out to another, SSC permits the destination server to communicate directly to the source server without client involvement, and removes the limitations on server to client bandwidth and the possible congestion it may cause.

Application Data Blocks (ADB)

ADB allows definition of the format of a file; for example, a VM image or a database. This feature will allow initialization of data stores; a single operation from the client can create a 300GB database or a VM image on the server.

Guaranteed Space Reservation & Hole Punching

As storage demands continue to increase, various efficiency techniques can be employed to give the appearance of a large virtual pool of storage on a much smaller storage system.   Thin provisioning, (where space appears available and reserved, but is not committed) is commonplace, but often problematic to manage in fast growing environments. The guaranteed space reservation feature in NFSv4.2 will ensure that, regardless of the thin provisioning policies, individual files will always have space available for their maximum extent.

NFS 3 image 2

While such guarantees are a reassurance for the end-user, they don’t help the storage administrator in his or her desire to fully utilize all his available storage. In support of better storage efficiencies, NFSv4.2 will introduce support for sparse files. Commonly called “hole punching”, deleted and unused parts of files are returned to the storage system’s free space pool.

Obtaining Servers and Clients

With this background on the features of NFS, there is considerable interest in the end-user community for NFSv4.1 support from both servers and clients. Many Network Attached Storage (NAS) vendors now support NFSv4, and in the last 12 months, there has been a flurry of activity and many developments in server support of NFSv4.1 and pNFS.

For NFS server vendors, there are NFSv4.1 and files based, block based and object based implementations of pNFS available; refer to the vendor websites, where you will get the latest up-to-date information.

On the client side, there is RedHat Enterprise Linux 6.4 that includes full support for NFSv4.1 and pNFS (see www.redhat.com), Novell SUSE Linux Enterprise Server 11 SP2 with NFSv4.1 and pNFS based on the 3.0 Linux kernel (see www.suse.com), and Fedora available at fedoraproject.org.

Conclusion          

NFSv4.1 includes features intended to enable its use in global wide area networks (WANs).   These advantages include:

  • Firewall-friendly single port operations
  • Advanced and aggressive cache management features
  • Internationalization support
  • Replication and migration facilities
  • Optional cryptography quality security, with access control facilities that are compatible across UNIX ® and Windows ®
  • Support for parallelism and data striping

The goal for NFSv4.1 and beyond is to define how you get to storage, not what your storage looks like. That has meant inevitable changes. Unlike earlier versions of NFS, the NFSv4 protocol integrates file locking, strong security, operation coalescing, and delegation capabilities to enhance client performance for data sharing applications on high-bandwidth networks.

NFSv4.1 servers and clients provide even more functionality such as wide striping of data to enhance performance.   NFSv4.2 and beyond promise further enhancements to the standard that increase its applicability to today’s application requirements. It is due to be ratified in August 2012, and we can expect to see server and client implementations that provide NFSv4.2 features soon after this; in some cases, the features are already being shipped now as vendor specific enhancements.  

With careful planning, migration to NFSv4.1 (and NFSv4.2 when it becomes generally available) from prior versions can be accomplished without modification to applications or the supporting operational infrastructure, for a wide range of applications; home directories, HPC storage servers, backup jobs and a variety of other applications.

 

FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.

Update: Want to learn more about NFS? Check out these SNIA ESF webcasts:

 

Red Hat adds commercial support for pNFS

Red Hat Enterprise Linux shipped their first commercially supported parallel NFS client on February 21st. The Red Hat ecosystem can deploy pNFS with the confidence of engineering, test, and long-term support of the industry standard protocol.

 

Red Hat Engineering has been working with the upstream community and several SNIA ESF member companies to backport code and test interoperability with RHEL6. This release supports all IO functions in pNFS, including Direct IO. Direct IO support is required for KVM virtualization, as well as to support the leading databases. Shared workloads and Big Data have performance and capacity requirements that scale unpredictably with business needs.

 

Parallel NFS (pNFS) enables scaling out NFS to improve performance, manage capacity and reduce complexity.   An IETF standard storage protocol, pNFS can deliver parallelized IO from a scale-out NFS array and uses out-of-band metadata services to deliver high-throughput solutions that are truly industry standard. SNIA ESF has published several papers and a webinar specifically focused on pNFS architecture and benefits. They can be found on the SNIA ESF Knowledge Center.

Where are you in your deployment of pNFS

View Results

Loading ... Loading ...

The Advantages of NFSv4.1

In a previous blog post Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be, some of the issues with NFSv3 that made it difficult to implement as a WAN based or data center wide protocol were discussed. The question then becomes; why not move to NFSv4 instead of NFSv4.1? Isn’t that a bigger leap from NFSv3?

Well, practical experience and some issues with NFSv4 made NFSv4.1 a necessity; for one, it introduces the key concept of sessions, and provides a foundation for pNFS (parallel NFS) which we’ll discuss in a later blog post. And all the features of NFSv4 were carried over into NFSv4.1, since it was a minor version update; there’s little more to do to take advantage of NFSv4.1, so that’s where your focus evaluation and implementation should be.

TCP for Transport

NFSv3 supports both TCP (Transmission Control Protocol) and UDP (User Datagram Protocol), and UDP is sometimes employed (for those applications that support it) because it is perceived to be lightweight and faster in comparison to TCP.

The downside of UDP is that it’s connectionless (that is, stateless) and an unreliable protocol. There is no guarantee that the datagrams will be delivered in any given order to the destination host — or even delivered at all — so applications must be specifically designed to handle missing, duplicate or incorrectly ordered data. UDP is also not a good network citizen; there is no concept of congestion or flow control, and no ability to apply quality of service (QoS) criteria.

The NFSv4 specification requires that any transport used provides congestion control. The easiest way to do this is via TCP. By using TCP, NFSv4 clients and servers are able to adapt to known frequent spikes in unreliability on the Internet; and retransmission is managed in the transport layer instead of in the application layer, greatly simplifying applications and their management on a shared network.

NFSv4 also introduces strict rules about retries over TCP in contrast to the complete lack of rules in NFSv3 for retries over TCP. As a result, if NFSv3 clients have timeouts that are too short, NFSv3 servers may drop requests. NFSv4 uses the timers that are built into the connection-oriented transport.

Network Ports

To access an NFS server, an NFSv3 client must contact the server’s portmapper to find the port of the mountd server. It then contacts the mount server to get an initial file handle, and again contacts the portmapper to get the port of the NFS server. Finally, the client can access the NFS server.

This creates problems for using NFS through firewalls, because firewalls typically filter traffic based on well-known port numbers. If the client is inside a firewalled network, and the server is outside the network, the firewall needs to know what ports the portmapper, mountd and nfsd servers are listening on. The mount server can listen on any port, so telling the firewall what port to permit is not practical. While the NFS server usually listens on port 2049, sometimes it does not. While the portmapper always listens on the same port (111), many firewall administrators, out of excessive caution, block requests to port 111 from inside the firewalled network to servers outside the network. As a result, NFSv3 is not practical to use through firewalls. (Aside from which, without security, it’s risky too.)

NFSv4 uses a single port number by mandating the server will listen on port 2049. There are no “auxiliary” protocols like statd, lockd and mountd required as the mounting and locking protocols have been incorporated into the NFSv4 protocol. This means that NFSv4 clients do not need to contact the portmapper, and do not need to access services on floating ports.

As NFSv4 uses a single TCP connection with a well-defined destination TCP port, it traverses firewalls and network address translation (NAT) devices with ease, and makes firewall configuration as simple as configuration for HTTP servers.

Mounts and Automounter

The automounter daemons and the utilities on different flavors of UNIX and Linux are capable of identifying different NFS versions. However, using the automounter will require at least port 111 to be permitted through any firewall between server and client, as it uses the portmapper.

This is undesirable if you are extending the use of NFSv4 beyond traditional NFSv3 environments, so in preference the widely available “mirror mount” facility can be used. It enhances the behavior of the NFSv4 client by creating a new mountpoint whenever it detects that a directory’s fsid differs from that of its parent and automatically mounts filesystems when they are encountered at the NFSv4 server .

This enhancement does not require the use of the automounter and therefore does not rely on the content or propagation of automounter maps, the availability of NFSv3 services such as mountd, or opening firewall ports beyond the single port 2049 required for NFSv4.

Internationalization Support; UTF-8

Yes, those funny characters outside of US-ASCII are supported. In a welcome recognition that it set no longer provides the descriptive capabilities demanded by languages with larger alphabets or those that use an extensive range of non-Roman glyphs, NFSv4 uses UTF-8 for file names, directories, symlinks and user and group identifiers. As UTF-8 is backwards compatible with 7 bit encoded ASCII, any names that are 7 bit ASCII will continue to work.

Compound RPCs

Latency in a wide area network (WAN) is a perennial issue, and is very often measured in tenths of a second to seconds. NFS uses Remote Procedure Calls (RPCs) to undertake all its communication with the server, and although the payload is normally small, meta-data operations are largely synchronous and serialized. Operations such as file lookup (LOOKUP), the fetching of attributes (GETATTR) and so on, make up the largest percentage by count of the average traffic load on NFS.

This mix of a typical NFS set of RPC calls in versions prior to NFSv4 requires each RPC call is a separate transaction over the wire. NFSv4 avoids the expense of single RPC requests and the attendant latency issues and allows these calls to be bundled together. For instance, a lookup, open, read and close can be sent once over the wire, and the server can execute the entire compound call as a single entity. The effect is to reduce latency considerably for multiple operations.

Delegations

Servers are employing ever more quantities of RAM and flash technologies, and very large caches in the orders of terabytes are not uncommon. Applications running over NFSv3 can’t take advantage of these caches unless they have specific application support. With increasing WAN latencies doing every IO over the wire introduces significant delay.

NFSv4 allows the server to delegate certain responsibilities to the client, a feature that allows caching locally where the data is being accessed. Once delegated, the client can act on the file locally with the guarantee that no other client has a conflicting need for the file; it allows the application to have locking, reading and writing requests serviced on the application server without any further communication with the NFS server. To prevent deadlocking conditions, the server can recall the delegation via an asynchronous callback to the client should there be a conflicting request for access to the file from a different client.

Migration, Replicas and Referrals

For broader use within a datacenter, and in support of high availability applications such as databases and virtual environments, copying data for backup and disaster recovery purposes, or the ability to migrate it to provide VM location independence are essential. NFSv4 provides facilities for both transparent replication and migration of data, and the client is responsible for ensuring that the application is unaware of these activities. An NFSv4 referral allows servers to redirect clients from this server’s namespace to another server; it allows the building of a global namespace while maintaining the data on discrete and separate servers.

Sessions

Perhaps one of the most significant features of NFSv4.1 is the introduction of stateful sessions. Sessions bring the advantages of correctness and simplicity to NFS semantics. In order to improve on the correctness of NFSv4, NFSv4.1 sessions introduce “exactly-once” semantics.
Servers maintain one or more session states in agreement with the client; they maintain the server’s state relative to the connections belonging to a client. Clients can be assured that their requests to the server have been executed, and that they will never be executed more than once.

Sessions extend the idea of NFSv4 delegations, which introduced server-initiated asynchronous callbacks; clients can initiate session requests for connections to the server. For WAN based systems, this simplifies operations through firewalls.

Security

An area of great confusion, many believe that NFSv4 requires the use of strong security. The NFSv4 specification simply states that implementation of strong RPC security by servers and clients is mandatory, not the use of strong RPC security. This misunderstanding may explain the reluctance of users from migrating to NFSv4 due to the additional work in implementing or modifying their existing Kerberos security.

Security is increasingly important as NFSv4 makes data more easily available over the WAN. This feature was considered so important by the IETF NFS working group that the security specification using Kerberos v5 was “retrofitted” to the NFSv2 and NFSv3 specifications.

Graphic for Advantages of NFS

Although access to an NFS filesystem without strong security such as provided by Kerberos is possible, across a WAN it should really be considered only as a temporary measure. In that spirit, it should be noted that NFSv4 can be used without implementing Kerberos security. The fact that it is possible does not make it desirable! A fuller description of the issues and some migration considerations can be found in the SNIA White Paper “Migrating from NFSv3 to NFSv4”.

Many of the practical issues faced in implementing robust Kerberos security in a UNIX environment can be eased by using a Windows Active Directory (AD) system. Windows uses the standard Kerberos protocol as specified in RFC 1510; AD user accounts are represented to Kerberos in the same way as accounts in UNIX realms. This can be a very attractive solution in mixed-mode environments.

In the next post, we’ll discuss one of the primary features of NFSv4.1; pNFS, or parallelized NFS, and some of the new work being done in support of NFSv4.2.
FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.

Update: Want to learn more about NFS? Check out these SNIA ESF webcasts:

Object Storage is a Big Deal (and Ethernet Matters)

A significant challenge in managing large amounts of data (or Big Data) is a lack of what I like to call “total data awareness”. It’s a situation where you know (or suspect) that you have data – you just can’t find it. When you think about many current IT environments, they are often not built for total data awareness. This starts with core elements of the IT infrastructure, such as file systems. Traditional file systems and access methods were not designed to store hundreds of millions or billions of files in a single namespace. This leads to admins storing data in multiple file systems, multiple shares, complex directory structures – not because the data should be logically organized in that way, but simply because of limitations in file system architectures. This issue becomes even more pressing when data sits in multiple locations, maybe even across on-premise and off-premise, cloud-based storage.

Is object-based storage the answer?

Think about how you find data on your computer. Do you navigate complex directory structures, trying to remember the file name of the file that hopefully has the data you are looking for – or have you moved on and just use search tools like Spotlight? Imagine you have hundreds of millions of files, scattered across dozens or hundreds of sites. How about just searching across these sites and immediately finding the data you are looking for? With object storage technology you have the ability to store data in objects, along with metadata that describes the object. Now you can just search for your data based on metadata tags (like a filename – or even better an account number and document type) – as well as manage data based on policies that leverage that metadata.

However, this often means that you have to consider interfacing with your storage system through APIs, as opposed to NFS and CIFS – so your applications need to support whatever API your storage vendor offers.

CDMI to the rescue?

Today, storage vendors often use proprietary APIs. This means that application vendors would have to support a plethora of APIs from a number of different vendors, leading to a lack of commitment from application vendors to support more innovative, object-based storage architectures.

A key path to solve this issue is to leverage technology and standards that have been specifically developed to provide this idea of a single namespace for billions of data sets and across locations and even managed services that might reside off-premise.

Relatively new on the standards side you have CDMI (http://www.snia.org/cdmi), the Cloud Data Management Interface. CDMI is a standard developed by SNIA (http://www.snia.org), the Storage Networking Industry Association, with heavy involvement from a number of leading storage vendors. CDMI not only introduces a standard interface to ingest and retrieve data into and out of a large-scale repository, it also enables applications to easily manage this repository and where the data sits.

CDMI is the new NFS

Forgive the provocation, but when it comes to creating and managing large, distributed content repositories it quickly becomes clear that NFS and CIFS are not ideally suited for this use case. This is where CDMI shines, especially with an object-based storage architecture behind it that was built to support multi-petabyte environments with billions of data sets across hundreds of sites and accommodates retention policies that can reach to “forever”.

CDMI and NFS have something in common – Ethernet

One of the key commonalities between CDMI and NFS is that they both are ideally suited to be deployed in an Ethernet infrastructure. CDMI, specifically, is a RESTful HTTP interface, so it runs on standard Ethernet networks. Even for object storage deployments that don’t support CDMI, practically all of these multi-site, long-term repositories support HTTP (and thus Ethernet) through proprietary APIs based on REST or SOAP.

Why does this matter

Ethernet infrastructure is a great foundation to run any number of workloads, including access to data that sits in large, multi-site content repositories that are based on object storage technologies. So if you are looking at object storage, chances are that you will be able to leverage existing Ethernet infrastructure.

Ethernet Storage Forum – 2012 Year in Review and What to Expect in 2013

As we come to a close of the year 2012, I want to share some of our successes and briefly highlight some new changes for 2013. Calendar year 2012 has been eventful and the SNIA-ESF has been busy. Here are some of our accomplishments:

  • 10GbE – With virtualization and network convergence, as well as the general availability of LOM and 10GBASE-T cabling, we saw this is a “breakout year” for 10GbE. In July, we published a comprehensive white paper titled “10GbE Comes of Age.” We then followed up with a Webcast “10GbE – Key Trends, Predictions and Drivers.” We ran this live once in the U.S. and once in the U.K. and combined, the Webcast has been viewed by over 400 people!
  • NFS – has also been a hot topic. In June we published a white paper “An Overview of NFSv4″ highlighting the many improved features NFSv4 has over NFSv3. A Webcast to help users upgrade, “NFSv4 – Plan for a Smooth Migration,” has also been well received with over 150 viewers to date.   A 4-part Webcast series on NFS is now planned. We kicked the series off last month with “Reasons to Start Working with NFSv4 Now” and will continue on this topic during the early part of 2013. Our next NFS Webcast will be “Advances in NFS – NFSv4.1 and pNFS.” You can register for that here.
  • Flash – The availability of solid state devices based on NAND flash is changing the performance efficiencies of storage. Our September Webcast “Flash – Plan for the Disruption” discusses how Flash is driving the need for 10GbE and has already been viewed by more than 150 people.

We have also added to expand membership and welcome new membership from Tonian and LSI to the ESF. We expect with this new charter to see an increase in membership participation as we drive incremental value and establish ourselves as a leadership voice for Ethernet Storage.

As we move into 2013, we expect two hot trends to continue – the broader use of file protocols in datacenter applications, and the continued push toward datacenter consolidation with the use of Ethernet as a storage network. In order to better address these two trends, we have modified our charter for 2013. Our NFS SIG will be renamed the File Protocol SIG and will focus on promoting not only NFS, but also SMB / CIFS solutions and protocols. The iSCSI SIG will be renamed to the Storage over Ethernet SIG and will focus on promoting data center convergence topics with Ethernet networks, including the use of block and file protocols, such as NFS, SMB, FCoE, and iSCSI, over the same wire. This modified charter will allow us to have a richer conversation around storage trends relevant to your IT environment.

So, here is to a successful 2012, and excitement for the coming year.

Why NFSv4.1 and pNFS are Better than NFSv3 Could Ever Be

NFSv4 has been a standard file sharing protocol since 2003, but has not been widely adopted; party because NFSv3 was “just good enough”. Yet, NFSv4 improves on NFSv3 in many important ways; and NFSv4.1 is a further improvement on that. In this post, I explain the how NFSv4.1 is better suited to a wide range of datacenter and HPC use than its predecessor NFSv3 and NFSv4, as well as providing resources for migrating from NFSv3 to NFSv4.1. And, most importantly, I make the argument that users should, at the very least, be evaluating and deploying NFSv4.1 for use in new projects; and ideally, should be using it wholesale in their existing environments.

The background to NFSv4.1
NFSv2 (specified in RFC-1813, but never an Internet standard) and its popular successor NFSv3 was first released in 1995 by Sun. NFSv3 has proved a popular and robust protocol over the 15 years it has been in use, and with wide adoption it soon eclipsed some of the early competitive UNIX-based filesystem protocols such as DFS and AFS. NFSv3 was extensively adopted by storage vendors and OS implementers beyond Sun’s Solaris; it was available on an extensive list of systems, including IBM’s AIX, HP’s HP-UX, Linux and FreeBSD. Even non-UNIX systems adopted NFSv3; Mac OS, OpenVMS, Microsoft Windows, Novell NetWare, and IBM’s AS/400 systems. In recognition of the advantages of interoperability and standardization, Sun relinquished control of future NFS standards work, and work leading to NFSv4 was by agreement between Sun and the Internet Society (ISOC), and is undertaken under the auspices of the Internet Engineering Task Force (IETF).

In April 2003, the Network File System (NFS) version 4 Protocol was ratified as an Internet standard, described in RFC-3530, which superseded NFSv3. This was the first open filesystem and networking protocol from the IETF. NFSv4 introduces the concept of state to ameliorate some of the less desirable features of NFSv3, and other enhancements to improved usability, management and performance.

But shortly following its release, an Internet draft written by Garth Gibson and Peter Corbett outlined several problems with NFSv4; specifically, that of limited bandwidth and scalability, since NFSv4 like NFSv3 requires that access is to a single server. NFSv4.1 (as described in RFC-5661, ratified in January 2010) was developed to overcome these limitations, and new features such as parallel NFS (pNFS) were standardized to address these issues.

Now NFSv4.2 is now moving towards ratification. In a change to the original IETF NFSv4 development work, where each revision took a significant amount of time to develop and ratify, the workgroup charter was modified to ensure that there would be no large standards documents that took years to develop, such as RFC-5661, and that additions to the standard would be an on-going yearly process. With these changes in the processes leading to standardization, features that will be ratified in NFSv4.2 (expected in early 2013) are available from many vendors and suppliers now.

Adoption of NFSv4.1
Every so often, I and others in the industry run Birds-of-a-Feather (BoFs) on the availability of NFSv4.1 clients and servers, and on the adoption of NFSv4.1 and pNFS. At our latest BoF at LISA ’12 in San Diego in December 2012, many of the attendees agreed; it’s time to move to NFSv4.1.

While there have been many advances and improvements to NFS, many users have elected to continue with NFSv3. NFSv4.1 is a mature and stable protocol with many advantages in its own right over its predecessors NFSv3 and NFSv2, yet adoption remains slow. Adequate for some purposes, NFSv3 is a familiar and well understood protocol; but with the demands being placed on storage by exponentially increasing data and compute growth, NFSv3 has become increasingly difficult to deploy and manage.

In essence, NFSv3 suffers from problems associated with statelessness. While some protocols such as HTTP and other RESTful APIs see benefit from not associating state with transactions – it considerably simplifies application development if no transaction from client to server depends on another transaction – in the NFS case, statelessness has led, amongst other downsides, to performance and lock management issues.

NFSv4.1 and parallel NFS (pNFS) address well-known NFSv3 “workarounds” that are used to obtain high bandwidth access; users that employ (usually very complicated) NFSv3 automounter maps and modify them to manage load balancing should find pNFS provides comparable performance that is significantly easier to manage.

So what’s the problem with NFSv3?
Extending the use of NFS across the WAN is difficult with NFSv3. Firewalls typically filter traffic based on well-known port numbers, but if the NFSv3 client is inside a firewalled network, and the server is outside the network, the firewall needs to know what ports the portmapper, mountd and nfsd servers are listening on. As a result of this promiscuous use of ports, the multiplicity of “moving parts” and a justifiable wariness on the part of network administrators to punch random holes through firewalls, NFSv3 is not practical to use in a WAN environment. By contrast, NFSv4 integrates many of these functions, and mandates that all traffic (now exclusively TCP) uses the single well-known port 2049.


Plus, NFSv3 is very chatty for WAN usage; and there may be many messages sent between the client and the server to undertake simple activities, such as finding, opening, reading and closing a file. NFSv4 can compound these operations into a single RPC (Remote Procedure Call) and reduce considerably the back-and-forth traffic across the network. The end result is reduced latency.

One of the most annoying NFSv3 “features” has been its handling of locks. Although NFSv3 is stateless, the essential addition of lock management (NLM) to prevent file corruption by competing clients means NFSv3 application recovery is slowed considerably. Very often stale locks have to be manually released, and the lock management is handled external to the protocol. NFSv4’s built-in lock leasing, lock timeouts, and client-server negotiation on recovery simplifies management considerably.

In a change from NFSv3, these locking and delegation features make NFSv4 stateful, but the simplicity of the original design is retained through well-defined recovery semantics in the face of client and server failures and network partitions. These are just some of the benefits that make NFSv4.1 desirable as a modern datacenter protocol, and for use in HPC, database and highly virtualized applications.
NFSv3 is extremely difficult to parallelise, and often takes some vendor-specific “pixie dust” to accomplish. In contrast, pNFS with NFSv4.1brings parallelization directly into the protocol; it allows many streams of data to multiple servers simultaneously, and it supports files as per usual, along with block and object support through an extensible layout mechanism. The management is definitely easier, as NFSv3 automounter maps and hand-created load-balancing schemes are eliminated and, by providing a standardized interface, pNFS ensures fewer issues in supporting multi-vendor NFS server environments.

Next post; the Advantages of NFSv4.1

FOOTNOTE: Parts of this blog were originally published in Usenix ;login: February 2012 under the title The Background to NFSv4.1. Used with permission.

Update: Want to learn more about NFS? Check out these SNIA ESF webcasts: