Docker’s rapid rise to prominence has put it on the radar of almost every technologist today, IT professionals and developers alike. Docker containers hold the promise of providing the ideal packaging and deployment mechanism for microservices, a concept that has also surged in popularity.
But while the industry loves its sparkling new technologies, it is also deeply skeptical of them. Until a new technology has been battle-tested, it’s just an unproven idea with a hipster logo, so it’s not surprising that Docker is being evaluated with a critical eye. It should be.
To properly assess Docker’s utility, however, it’s necessary to follow container-based architecture to its logical conclusion. The benefits of isolation and portability, which get most of the attention, are reasons enough to adopt Docker containers. But the real game changer, I believe, is the deployment of containers in clusters. Container clusters managed by a framework like Google’s Kubernetes allow for the true separation of application code and infrastructure, and enable highly resilient and elastic architectures.
It is these three benefits in combination—isolation, portability, and container clustering—that are the real reasons why Docker represents such a significant evolution in how we build and deploy software. Containers further advance the paradigm shift in application development brought about by cloud computing by providing a higher layer of abstraction for application deployment, a concept we’ll explore in more detail later.
Is Docker worth it?
However, you don’t get the benefits of containers for free: Docker does add a layer of complexity to application development. The learning curve for Docker itself is relatively small, but it gets steeper when you add clustering. The question then becomes: is the juice worth the squeeze? That is, do containers provide enough tangible benefit to justify the additional complexity? Or are we just wasting our time following the latest fad?
Certainly, Docker is not the right solution for every project. The Ruby/Django/Laravel/Node.js crowd will be the first to point out that their PaaS-ready frameworks already give them rapid development, continuous delivery and portability. Continuous integration platform provider CircleCI wrote a hilarious post poking fun at Docker, transcribing a fictitious conversation in which a Docker evangelist tries to explain the benefits of containers to a Ruby developer. The resulting confusion about the container ecosystem perfectly captures the situation.
Microsoft .NET developers also scratch their heads when trying to understand why Docker is such a big deal. This is reasonable given the fact that Docker is currently a *nix-only technology. (This won’t be the case for long, however: Microsoft is making significant kernel changes to support Docker containers.) This confusion among .NET devs also makes sense because it’s just a heck of a lot easier to deploy a Windows application than an open source app. On Windows you don’t have multiple web servers that need to support multiple application frameworks that need to run on different operating systems, all of which have different versions that break everything built in prior versions. On a Microsoft stack you have a common, full-featured language (C#), a great, full-featured IDE (Visual Studio) and a single web/application server (IIS) that’s baked into the OS. Easy peasy. The fact that Docker has powerful server provisioning capabilities isn’t that interesting when you already get everything you need out of the box.
So, if you’re a Windows developer working on a vanilla .NET application, Docker probably won’t provide much benefit (at least right now). Or, if you’re a Ruby/Django/Laravel/Node.js developer building a relatively straightforward web app, you can afford some downtime, and you don’t anticipate much growth in the future, then an established application framework is probably all you need.
However, most of us don’t operate in such simple environments. Even if we’re not Amazon, Facebook or Google, we have requirements that exceed what is possible with a framework alone. For this silent majority, Docker is an attractive option, especially for teams who:
- Iterate quickly and deploy frequently, but want to do so with minimal risk.
- Have apps that are too complex to be fully implemented by a basic MVC architecture.
- Want to write application code that is completely ignorant about the infrastructure on which it runs.
- Need to scale now or in the future.
- Want robust, fault-tolerant software.
- Need fast and frequent updates to handle rapid patching and security fixes.
- Cannot afford downtime, even for updates.
- Have adopted a microservices architecture.
- Are managing applications implemented in multiple technologies.
- Want no-hassle, error-free deployments between dev, staging and production.
- Want to minimize cloud provider costs by maximizing the utilization of each instance.
Benefit #1: Efficient Isolation
If you cross the Rubicon and make the decision to explore Docker containers, the first benefit you’ll find is process isolation. Each Docker container runs in its own isolated process space, which acts like its own mini operating system. That means that each container can run its own OS distribution, install its own libraries and have its own configuration files. This eliminates a whole class of problems that arise when you’re building and deploying multiple services on a single server. The more stuff you have running on a single machine, the more difficult it is to deploy, troubleshoot and maintain. Isolation minimizes these problems.
Now, you might be wondering: How does this differ from machine virtualization? Don’t VMs provide isolation too? And the answer is yes: both containers and VMs offer process isolation. The big difference is the layer used to provide the isolation. VMs provide isolation at the hypervisor level. This means that VMs run on a hypervisor that emulates a whole machine, so each VM carries its own virtual hardware and its own complete operating system. Containers, in contrast, provide isolation at the operating system level. This means that containers share the kernel of a host OS running Docker, which virtualizes only the parts of the OS that matter to application developers, like the file system and process namespace.
The net effect of this difference is that VM-based isolation is more coarse-grained and resource intensive, while containers provide a more fine-grained, lightweight alternative. This difference becomes readily apparent when you deploy your application to the public cloud, where you’re paying by the hour for each VM instance. If you’re using VM-based isolation to segment your various application services, then you’ll have to deploy each component on a separate VM instance (which you’re paying for), even if you don’t need or want to. By contrast, if you isolate your application services using containers, then you can deploy the services on the same instance or separate instances as you see fit, in order to maximize performance and minimize cost. The lightweight nature of Docker containers makes them a more flexible and efficient way to achieve process isolation.
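To put rough numbers on that difference, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the service names, the CPU share each one needs, and the idea that CPU is the only constraint. It simply counts how many instances you would pay for with one VM per service versus packing containers onto shared instances with a simple first-fit-decreasing strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    cpu: float  # share of one instance's CPU the service needs (made-up numbers)


def instances_with_vm_isolation(services):
    """One VM instance per service: the VM boundary provides the isolation."""
    return len(services)


def instances_with_containers(services, capacity=1.0):
    """First-fit-decreasing packing: services share an instance while it has room."""
    free = []  # remaining capacity on each instance opened so far
    for svc in sorted(services, key=lambda s: s.cpu, reverse=True):
        if svc.cpu > capacity:
            raise ValueError(f"{svc.name} does not fit on a single instance")
        for i, room in enumerate(free):
            if svc.cpu <= room:
                free[i] = room - svc.cpu
                break
        else:
            # No existing instance has room, so open a new one.
            free.append(capacity - svc.cpu)
    return len(free)


# Hypothetical application made of five small services.
services = [
    Service("web", 0.50),
    Service("api", 0.25),
    Service("worker", 0.25),
    Service("cache", 0.50),
    Service("mailer", 0.25),
]

vm_count = instances_with_vm_isolation(services)
container_count = instances_with_containers(services)
print(f"VM-per-service: {vm_count} instances, containers: {container_count} instances")
```

With these made-up numbers the services need 1.75 instances' worth of CPU in total, so containers let them share two instances instead of five. Real placement decisions also weigh memory, I/O and failure domains, which this sketch ignores.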
Benefit #2: Portability
The second big benefit of Docker is portability. Docker provides a system for building “hermetically sealed” containers that can run on any OS capable of running Docker. Docker helps achieve portability by making containers self-provisioning. When you develop your app for a Docker container, you use Docker’s provisioning capabilities to programmatically build your application “server”. Using the application’s Dockerfile, you specify the libraries to install, configuration files to add, scripts to run, and the service to be started. This replaces more heavy-duty configuration management tools like Chef, Puppet, Ansible and Salt. Although you can still use these tools in conjunction with Docker to provision containers, the reality is you probably won’t. The power, simplicity and speed of provisioning containers natively in Docker eliminate the need for configuration management tools for many if not most use cases.
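To make the four ingredients of a Dockerfile concrete, here is a small Python sketch that renders one from a plain description. The helper `render_dockerfile` is hypothetical, as are the base image, package, file paths and commands passed to it, and the `apt-get` line assumes a Debian-style base image. Only the instructions it emits (`FROM`, `RUN`, `COPY`, `CMD`) are real Dockerfile syntax.

```python
import json


def render_dockerfile(base_image, packages, config_files, setup_commands, command):
    """Render Dockerfile text from a description of the application "server"."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Libraries to install (assumes an apt-based image).
        lines.append("RUN apt-get update && apt-get install -y " + " ".join(packages))
    for source, destination in config_files:
        # Configuration files to add to the image.
        lines.append(f"COPY {source} {destination}")
    for setup in setup_commands:
        # Scripts or commands to run while the image is built.
        lines.append(f"RUN {setup}")
    # The service to start, in CMD's exec form (a JSON array).
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"


# Illustrative values only; none of these come from a real project.
dockerfile = render_dockerfile(
    base_image="debian:stable-slim",
    packages=["nginx"],
    config_files=[("nginx.conf", "/etc/nginx/nginx.conf")],
    setup_commands=["mkdir -p /var/www/app"],
    command=["nginx", "-g", "daemon off;"],
)
print(dockerfile)
```

In practice you would write the Dockerfile by hand and keep it in source control next to the application code. The point of the sketch is only that the whole “server” is described by a handful of declarative lines.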
And once we have self-provisioned, isolated containers that can run anywhere, we just need a way to easily deploy them to achieve true portability. This is where Docker Hub comes in. Docker Hub provides a highly available registry (which can be public or private) where we can push our container images when they’re ready to deploy, just like we would push our source code to a Git repository. Then, in staging and production, all we have to do is pull down these images from Docker Hub and run them. (In reality though, you’ll probably still use a CI server like Jenkins to move the application through the build pipeline, but Docker’s portability makes this process much more robust.)
Another benefit of running your application in containers is that it allows you to run a streamlined, purpose-built, Docker-focused distribution on the host machine in staging and production. With all of your OS-specific application configuration tucked away inside the container, the underlying host OS can be stripped down to the bare essentials. VMware Photon, RancherOS and CoreOS are a few of the more popular choices. CoreOS is arguably the most popular, in large part because it focuses on two critically important features: 1) it’s designed to self-update safely, which greatly improves security, and 2) it is designed from the ground up to run Docker clusters, which brings us to the third and most profound benefit of Docker containers.
Benefit #3: Container Clusters
Efficient isolation and portability by themselves make containers very powerful. But the superpower that makes containers truly evolutionary is their ability to run in clusters. A container cluster is a group of servers presented as a single logical unit for purposes of application deployment and management. We can deploy our containers to a cluster manager like Kubernetes and let the cluster itself determine which of the underlying servers will run them. If we scale the cluster up or down by adding or removing servers, it will automatically and intelligently resize the entire application. If a server in the cluster fails, the cluster will take it offline and rebalance the load. A robust container cluster like Kubernetes can also handle advanced tasks like performing a rolling update, and it will even roll back the changes automatically in the event of a failure.
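The rolling update idea is easy to sketch. The toy Python model below is not how Kubernetes implements it. The function `rolling_update`, the version strings and the `is_healthy` callback are all hypothetical stand-ins. It only shows the shape of the logic: replace replicas one at a time, check health after each step, and restore the original versions if a check fails.

```python
def rolling_update(replicas, new_version, is_healthy):
    """Replace replicas one at a time; restore the originals if a health check fails.

    replicas: list of version strings, one per running replica.
    is_healthy: callable(index, version) -> bool, standing in for a real health check.
    Returns (resulting_replicas, succeeded).
    """
    original = list(replicas)
    current = list(replicas)
    for index in range(len(current)):
        current[index] = new_version
        if not is_healthy(index, new_version):
            # Roll back: discard the partial update and keep what was running before.
            return original, False
    return current, True


running = ["v1", "v1", "v1"]

# Every replica passes its health check, so the update completes.
updated, ok = rolling_update(running, "v2", lambda index, version: True)

# The second replica fails its check, so the whole update is rolled back.
reverted, ok_bad = rolling_update(running, "v3", lambda index, version: index != 1)

print(updated, ok)
print(reverted, ok_bad)
```

Because only one replica changes at a time, the other replicas keep serving traffic throughout, which is what makes updates without downtime possible.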
But best of all, we can enjoy all of these benefits with minimal or no change to our application code. Once our application is written to be containerized, then our code can be completely ignorant about the infrastructure on which it runs. We achieve this by taking our self-provisioned, hermetically sealed containers and combining them with configuration files that describe how the containers should be deployed. So rather than telling the cluster which containers should run on which hosts (an imperative approach), we specify the desired end-state and let Kubernetes schedule the services to run on individual hosts as it sees fit to achieve optimal performance (a declarative model). This gives us the ability to have some say in how our application should be deployed (e.g. web server containers should not be deployed on the same hosts as database containers), without forcing us to go down a rabbit hole of prescriptive container scheduling.
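Here is a minimal Python sketch of the declarative idea. It is a toy, not the Kubernetes scheduler. The `schedule` function, the host names and the `keep_apart` rule are all assumptions made for illustration. We state the desired end state (how many replicas of each service, and which services must not share a host) and let the function choose the hosts.

```python
def schedule(desired, hosts, keep_apart=()):
    """Place replicas on hosts to satisfy `desired`, a {service: replica_count} mapping.

    keep_apart: pairs of service names that must not share a host.
    Returns {host: [service, ...]}; raises RuntimeError if the greedy pass gets stuck.
    """
    conflicts = {frozenset(pair) for pair in keep_apart}
    placement = {host: [] for host in hosts}
    for service, count in desired.items():
        for _ in range(count):
            # Hosts that run nothing this service is forbidden to share with.
            candidates = [
                host for host in hosts
                if not any(frozenset((service, other)) in conflicts
                           for other in placement[host])
            ]
            if not candidates:
                raise RuntimeError(f"no host can run {service}")
            # Prefer the least-loaded candidate to spread the work.
            target = min(candidates, key=lambda host: len(placement[host]))
            placement[target].append(service)
    return placement


# Desired end state: what should run, not where it should run.
desired = {"db": 1, "web": 2}
rules = [("web", "db")]  # web and database containers must not share a host

before = schedule(desired, ["host-a", "host-b", "host-c"], keep_apart=rules)

# host-c fails: the same desired state is simply scheduled onto the remaining hosts.
after = schedule(desired, ["host-a", "host-b"], keep_apart=rules)

print(before)
print(after)
```

Notice that nothing in `desired` changed when the host disappeared. That is the practical payoff of the declarative model. This greedy pass can get stuck in cases where a valid placement exists, which is one reason real schedulers are far more sophisticated.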
Clusters also provide immutable endpoints we can code to (e.g. a virtual endpoint for a database that our application needs to connect to), which eliminates the need for us to maintain the actual host names and IP addresses of the backend services we need to talk with. This type of service discovery functionality becomes essential in a clustered environment because we never know which actual hosts our services are running on.
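A stable endpoint of this kind can be modeled in a few lines of Python. The `ServiceRegistry` class, the service name `orders-db` and the addresses below are hypothetical, and a real cluster typically does this with virtual IPs or DNS rather than an in-process object. The point is that the application only ever knows the service name, while the cluster is free to change the backends behind it.

```python
class ServiceRegistry:
    """Maps a stable service name to whatever backends currently serve it."""

    def __init__(self):
        self._backends = {}
        self._next = {}

    def set_backends(self, service, addresses):
        """Called by the cluster whenever containers for a service start, stop or move."""
        self._backends[service] = list(addresses)
        self._next[service] = 0

    def resolve(self, service):
        """Called by application code: return one current backend, round-robin."""
        backends = self._backends.get(service)
        if not backends:
            raise LookupError(f"no backends registered for {service}")
        index = self._next[service] % len(backends)
        self._next[service] = index + 1
        return backends[index]


registry = ServiceRegistry()

# The cluster starts the database on one host.
registry.set_backends("orders-db", ["10.0.1.4:5432"])
first = registry.resolve("orders-db")

# Later the cluster moves and scales it; the name the application uses is unchanged.
registry.set_backends("orders-db", ["10.0.2.9:5432", "10.0.3.7:5432"])
later = [registry.resolve("orders-db") for _ in range(3)]

print(first)
print(later)
```

The application code that calls `resolve("orders-db")` never changes, even though the set of hosts behind that name does.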
The end result of these features is that container clusters create a new layer of abstraction in our application platform—the application cluster layer. Google engineer and Kubernetes founder Brendan Burns gave a great talk at CoreOS Fest that illustrates this evolution. He describes four layers in the modern technology stack:
| Layer | Owner | Examples |
|---|---|---|
| Application Layer | Developer | Application code, in whatever platform you code in + container provisioning script (i.e. the |
| Cluster Layer | DevOps | Container cluster like Google Kubernetes, Docker Swarm, CoreOS Fleet or Apache Mesos |
| Operating System Layer | Sys Admin | CoreOS, RancherOS, VMware Photon and Windows Server 2016 |
| Hardware Layer | Public cloud provider or Data Center Engineer | Public Cloud: AWS, Microsoft Azure, VMware vCloud Air. Internal Datacenter: Cisco, EMC, VMware vSphere, etc. |
By creating a Cluster Layer that separates the Application Layer from the Operating System Layer, application development can be decoupled from infrastructure concerns, in the same way that cloud virtualization abstracted hardware concerns from system administrators. Although I’m a firm believer that developers should have a strong working knowledge of the entire stack, they shouldn’t have to be cluster ninjas to build applications that are scalable and robust. The irony here is that Kubernetes—a technology developed by Google to run thousands of containers under intense load—makes it easier for the rest of us to build resilient architectures too. Even if you’re not running a 1000-node cluster, you should be looking at Kubernetes to help make your app more robust and efficient.
Docker is still in its infancy and container clusters are even newer, so it remains to be seen if the reality of containers will live up to their promise. Nonetheless, container-based architecture represents a significant evolution in software development. The benefits of isolation and portability are part of the story when it comes to understanding the power of containers, but the real reason they are such a big deal is their ability to run in clusters. Container clusters like Kubernetes enable mere mortals to design and maintain highly resilient and scalable applications by abstracting container deployment and management to the Cluster Layer. And although clustering itself is not new, the flexibility, control, granularity and breadth of technology supported by container clusters make them a genuine step forward in how we build and deploy software.