We covered a lot of ground in our recent SNIA Ethernet Storage Forum webcast, “Current State of Storage in the Container World.” We had a technical discussion on why containers are so compelling, how Docker containers work, persistent shared storage and future considerations for container storage. We received some great questions during the live event, and as promised, here are answers to them all.
Q. Docker cannot be installed on bare metal and requires a base OS to operate upon, right?
A. That is correct.
Q. Does the application code need to be changed so that it can “fit and operate” in a container?
A. No, the application code does not need to change. The challenge most people face when migrating an application to a container is how to maintain the application’s state. One of the motivations for this webcast was to explain how to allow applications within containers to persist data. Hopefully the Docker Volume construct will meet your needs.
Q. Seems like containers share one OS/kernel… That suggests that there is just one OS in the “containerized” server… And yet there is still mention of hypervisor (or at least Hyper-V)… Can you clarify? If the containers share an OS, is a hypervisor needed?
A. You are correct: containers are designed to share a single kernel, so a hypervisor is not required to run containers. Having said that, VMware and Microsoft both offer options that run a single container in its own virtual machine (running a minimal operating system).
Q. Can Docker Hub be compared to something like GitHub?
A. Yes, that is a great analogy. Docker Hub (hub.docker.com) is to container images as GitHub (github.com) is to source code.
Q. What are the differences between the base and the host image?
A. If you’re referring to the webcast slides, the box labeled “Base Image” is the first layer in an image. The box labeled “Host OS” is not a layer, but represents the hosting operating system (kernel) that is shared by the containers.
Q. So there is a separate root per container?
A. In most cases the image will provide a root, so each container will have a separate root. This is made possible by a kernel feature called namespaces. Docker also allows you to share a directory between the host operating system and any number of containers.
Q. If Deduplication is enabled on the storage LUNs, won’t that affect the performance of the containers?
A. Well-implemented data reduction features (compression and deduplication) should have little to no effect on performance and should provide significant benefit by reducing the space required to store containers.
Q. Can you please quickly review the concept of copy-on-write with one or two sentences to boil it down?
A. How copy-on-write works depends on whether the driver is file- or block-based. For the sake of simplicity, let’s assume a file-based implementation. Since the image layers are read-only, we need an area to store the changes that the container has made. This area is the copy-on-write layer. When a process reads a file that has not been modified, the file is read from one of the read-only layers. When that file is modified and needs to be written back to disk, the new file is written to the copy-on-write layer, as is the metadata that describes the file. The next time this file is read, it is read from the copy-on-write layer. The graph driver is responsible for this functionality, and the details vary by implementation.
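To make the file-based case concrete, here is a minimal Python sketch of the read and write paths described above. It is a toy model, not how any real graph driver is implemented: the `LayeredFS` class, the dictionaries standing in for layers, and the file paths are all made up for illustration.

```python
class LayeredFS:
    """Toy file-based copy-on-write view over read-only image layers."""

    def __init__(self, image_layers):
        # image_layers: list of dicts (path -> contents), base image first.
        # They are treated as read-only and are never modified by this class.
        self.image_layers = image_layers
        # Per-container writable area: the copy-on-write layer.
        self.cow_layer = {}

    def read(self, path):
        # A file the container has modified is served from the CoW layer.
        if path in self.cow_layer:
            return self.cow_layer[path]
        # Otherwise search the read-only layers, topmost layer first.
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, contents):
        # Writes never touch the image layers; they land in the CoW layer.
        self.cow_layer[path] = contents


# Two containers started from the same (hypothetical) two-layer image.
base = {"/etc/motd": "hello from base"}
app = {"/app/config": "debug=false"}

container_a = LayeredFS([base, app])
container_b = LayeredFS([base, app])

# Container A modifies a file; only its own CoW layer changes.
container_a.write("/app/config", "debug=true")
```

After the write, container A reads its modified copy, while container B and the image layer itself still see the original file. That is what lets many containers share one set of image layers.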
Q. Can network locations be used for /data? If yes, how does the Docker Engine manage network authentication for the driver?
A. Yes, network locations can be used. The best practice is to use the Local Volume Driver, where you can pass in the required authentication via the options (see slide 15). Alternatively, the network location can be mounted on the host operating system and exposed to containers (see slides 21 & 22).
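As a rough illustration of passing credentials through the local driver's options, the Python sketch below assembles a `docker volume create` command for a CIFS share. It is not necessarily what the slide shows. The server address, share name, user name and password are placeholders, and the helper function is hypothetical; the `--driver` and `--opt` flags and the `type`, `device` and `o` options are the ones the local driver accepts.

```python
import shlex


def cifs_volume_command(name, server, share, username, password):
    """Build a `docker volume create` command for a CIFS share.

    Hypothetical helper for illustration. The local driver hands the
    `type`, `device` and `o` options to the host's mount facility.
    """
    share_path = f"//{server}/{share}"
    mount_opts = f"addr={server},username={username},password={password}"
    return [
        "docker", "volume", "create",
        "--driver", "local",
        "--opt", "type=cifs",
        "--opt", f"device={share_path}",
        "--opt", f"o={mount_opts}",
        name,
    ]


# Placeholder values; substitute your own server, share and credentials.
cmd = cifs_volume_command("data", "192.0.2.10", "projects",
                          "svc_docker", "REPLACE_ME")
print(shlex.join(cmd))
```

A container could then mount the volume at /data with something like `docker run -v data:/data ...`. Keep in mind that credentials passed this way are visible in the command line and in the volume's metadata, so treat this as a sketch rather than a hardened setup.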
Q. Is this where VAAI like primitives would get implemented?
A. VAAI defines several in-band primitives. The Docker Volume plug-in framework is completely out-of-band. There can be some overlap in features though. For example, the XCOPY primitive can be used to offload ‘copy jobs’ to an array. If the vendor chooses to do so, a ‘copy job’ can be offloaded through the Docker Volume plug-in as well. For example, a plug-in might implement a “clone” option that provides this service.
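Here is a sketch of how such a “clone” option might look from the plug-in side, written in Python. The request and response shapes (`Name`, `Opts`, `Err`) follow the volume plug-in protocol's `/VolumeDriver.Create` call, but the `clone` option, the `handle_create` function and the in-memory volume table are assumptions made for illustration. A real plug-in would ask the array to perform the copy at the marked step.

```python
import json


def handle_create(body, volumes):
    """Handle the JSON body of a /VolumeDriver.Create request.

    `volumes` is an in-memory stand-in for the array's volume catalog.
    """
    request = json.loads(body)
    name = request["Name"]
    opts = request.get("Opts") or {}

    # "clone" is a hypothetical, vendor-defined option, not part of Docker.
    source = opts.get("clone")
    if source is None:
        volumes[name] = {"cloned_from": None}
    elif source not in volumes:
        return json.dumps({"Err": f"source volume {source} not found"})
    else:
        # A real plug-in would offload the copy job to the array here.
        volumes[name] = {"cloned_from": source}

    # An empty Err string signals success to the Docker Engine.
    return json.dumps({"Err": ""})


volumes = {}
created = handle_create('{"Name": "db", "Opts": {}}', volumes)
cloned = handle_create('{"Name": "db-copy", "Opts": {"clone": "db"}}', volumes)
failed = handle_create('{"Name": "x", "Opts": {"clone": "missing"}}', volumes)
```

With a plug-in like this, a user could request the clone with `docker volume create --driver <plug-in name> --opt clone=db db-copy`. The copy then happens out-of-band on the array rather than through the host's data path.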
Q. Could you share some details about Kubernetes storage? Persistent volumes and the difference from Docker volumes? Also, what is your perspective on Flocker?
A. Kubernetes has the concept of persistent storage. This abstraction is also called a volume. In addition, Kubernetes provides a plug-in option as well. The Kubernetes implementation predates the Docker Volume and is currently not compatible.
Q. Comment on mainframe: IBM runs Linux on zSeries, therefore can run Linux Docker containers.
A. Thanks, that’s good to know.
Q. How many operating system changes have there been on the x86 platform? How many on the mainframe platform? Can x86 architecture run the same code/OS from 40 years ago? Docker on mainframe?
A. The mainframe architecture has been very solid and consistent for many years.
Q. What is a big challenge for storage in a container environment?
A. I don’t think storage has a challenge in the container environment. I think, with a properly implemented Docker Volume Plug-in, storage provides a solution to the persistent shared storage need in a container environment.
Q. Do you ever look into RexRay or VMDK storage drivers?
A. Yes, these are both examples of Docker Volume plug-in implementations.
Update: If you missed the live event, it’s now available on-demand. You can also download the webcast slides.