One of the biggest challenges in the ADM world has always been environment-specific dependencies. An application that works perfectly well in one environment, tested and QA-approved there, suddenly starts behaving differently when promoted to a higher environment. When an application has to be “shipped” from one environment to another, the process usually involves creating a binary distributable of the application, which can then be deployed in the target environment. However, these packaged “binary distributables” still depend on the environment they run in: the specific application server’s runtime, external libraries or even the operating system.
In a more traditional IT enterprise, the development team controls only the application binary, whereas all the other “dependencies” are controlled by the operations team. This makes it even less certain whether the same application binary will work equally well when migrated from one environment to another.
In an ideal world, we’d want these application binaries to bundle all their dependencies into the distributable, so as to eliminate any dependency on the “external environment”. However, doing that would mean each application’s binary distributable includes everything, from the OS to the app server to the libraries, all packaged together. This is what “virtualization” is all about. You have ONE single distributable with minimal “external dependencies” which, when launched on a piece of hardware, brings up the application as a whole and is expected to run just as it did in the earlier environments. This ONE single distributable is what we call a Virtual Machine (VM) image. But virtualization at the application level would mean a massive footprint for each application and a great deal of redundancy, leading to an excessive need for resources to run the redundant services.
That’s where containers come to the rescue. Simply put, containers allow you to package your applications into standardized units for software development that have no dependency on the “environment.” The concept is similar to virtualization, but abstracted at a different level.
To understand this better, let’s first take a closer look at the internals of server virtualization. In the case of virtualization, each Virtual Machine (VM) wraps within itself an entire machine (except for the hardware). The VM includes the OS, the drivers, the libraries, the application server and the actual application. Multiple such VMs (often referred to as “guests”) run on a single “physical” server, which itself has its own operating system (referred to as the “host”). The management of these guest VMs and their use of the host’s resources is handled by an intermediate layer called the “hypervisor”.
It is the job of the hypervisor to provide isolation of system resources – across the various VMs running within it.
Containers, on the other hand, do not have their own operating system. Multiple container instances running on a server (physical or virtual) share the same host operating system. This approach greatly reduces the footprint of each application and consumes fewer resources at runtime, because each container holds only the application and its related binaries or libraries. Setting up a container also no longer involves installing (and possibly paying for the license of) an operating system for each instance. Just as a hypervisor manages resources across the VMs running on a server, here that role is handled by a containerization engine, like Docker, which is installed on top of the host operating system. The container engine ensures that even though the containers share the same OS, each container is still isolated from the others.
There is a flipside to it, though. A VM runs its own OS, while a container doesn’t have its own OS but shares the host OS. This means you can run VMs with any guest OS (Windows, Solaris, FreeBSD, etc.) on a Linux host, whereas you can only run Linux containers on a Linux kernel. (Containers have a dependency on the host OS.)
Moreover, there can be security issues when two applications run in containers hosted on the same server. A vulnerability in one application can potentially expose the other containerized applications running on that same server. Although the container engine provides isolation to some extent, containers are not as securely isolated as VMs because they share an OS.
Since each application’s container is free of OS overhead, the container is notably smaller, easier to migrate or download, faster to back up or restore, and requires less memory. Containerization allows a server to potentially host far more containers than it could host virtual machines. The difference in utilization can be dramatic: depending on the workload, it is possible to fit anywhere from 10 to 100 times the number of container instances on a given server (compared to the number of VM-based application instances).
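To make the density argument concrete, here is a back-of-the-envelope sketch in Python. All figures (host memory, application footprint, guest OS overhead, per-container overhead) are hypothetical assumptions chosen for illustration, not measurements, and the model considers memory only.

```python
# Hypothetical sizing figures, chosen only to illustrate the arithmetic.
HOST_MEMORY_MB = 65536        # assumed: a 64 GB server
APP_MEMORY_MB = 256           # assumed: memory the application itself needs
GUEST_OS_MEMORY_MB = 1024     # assumed: overhead of a full guest OS per VM
CONTAINER_OVERHEAD_MB = 16    # assumed: per-container runtime overhead


def instances_that_fit(host_mb: int, per_instance_mb: int) -> int:
    """Return how many instances of a given size fit into host memory."""
    return host_mb // per_instance_mb


# Each VM carries the application plus a full guest OS.
vm_count = instances_that_fit(HOST_MEMORY_MB, APP_MEMORY_MB + GUEST_OS_MEMORY_MB)

# Each container carries the application plus a small runtime overhead.
container_count = instances_that_fit(
    HOST_MEMORY_MB, APP_MEMORY_MB + CONTAINER_OVERHEAD_MB
)

print(f"VM-based instances:        {vm_count}")
print(f"Container-based instances: {container_count}")
print(f"Density ratio:             {container_count / vm_count:.1f}x")
```

With these assumed figures the ratio comes out at roughly 4.7x. The gap widens as the application footprint shrinks relative to the guest OS overhead, which is why small services benefit the most.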
So why are containers suddenly becoming so popular now?
As discussed in the previous section, containerization has definite advantages over virtualization: containers consume far fewer resources, start up much faster and reduce OS licensing costs.
However, it also comes with a tradeoff in areas such as security and complete isolation of applications, where virtualization is far more mature.
Although businesses cannot ignore security and stability, the significant advantages offered by containers cannot be overlooked either. Businesses are continuously looking for the best of both worlds, but how?
The answer lies in the larger IT ecosystem and trends which include Cloud, Microservices, DevOps and automation! Yes, they are all related and in order to fully leverage the potential of one, it is imperative to enable all these four areas within your organization.
Despite the buzz about cloud computing, only a fraction of organizations have actually implemented Enterprise Cloud. Most companies have consolidated their datacenters and implemented server virtualization, but that is really only the first step towards cloud migration. To be able to leverage the true potential of Cloud, one still needs the ability to automate scaling at each component/module level and that means two things –
- Organizations need to be able to break their monoliths into smaller independently deployable packages (a.k.a. microservices)
- Organizations need highly mature processes and toolchains to allow full automation of the deployment of these individual modules (a.k.a. DevOps)
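The value of scaling at the component level can be illustrated with a small Python sketch. The module names, load figures and per-instance capacity are hypothetical, and the monolith model (every copy carries all modules, so the busiest module dictates the number of copies) is a deliberate simplification.

```python
import math

# Hypothetical figures for illustration only.
CAPACITY_PER_INSTANCE = 100  # assumed: requests/second one instance can serve

# Assumed current load (requests/second) on each module of an application.
load_by_module = {"catalog": 120, "checkout": 950, "reviews": 40}


def replicas_needed(load: int, capacity: int = CAPACITY_PER_INSTANCE) -> int:
    """Smallest number of instances that can serve the load (at least one)."""
    return max(1, math.ceil(load / capacity))


# Microservices: each module is scaled independently of the others.
per_module = {name: replicas_needed(load) for name, load in load_by_module.items()}
microservice_instances = sum(per_module.values())

# Monolith (simplified): every copy contains all modules, so the busiest
# module dictates how many full copies have to run.
monolith_copies = max(per_module.values())
monolith_module_instances = monolith_copies * len(load_by_module)

print(per_module)
print("module instances as microservices:", microservice_instances)
print("module instances as a monolith:   ", monolith_module_instances)
```

Under these assumptions only the busy module is replicated heavily in the microservice case, while the monolith replicates every module as often as its busiest one.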
Now, containers make it possible to break the interdependencies between modules/applications and the environments they run in; however, they have an issue with isolation between containers running on the same host.
Microservices architecture enables isolation between modules by creating independently deployable packages.
Combining both worlds: decoupled, non-chatty modules deployed as individual containerized microservices do not depend on talking to each other and hence, architecturally, they are isolated applications. This means that microservices deployed in containers largely circumvent the security and isolation disadvantage of containers, because that concern is addressed at the application layer by the microservice architecture. This is what is making containers more and more popular.
Moreover, DevOps provides mature tools and processes that help automate the container orchestration.
The runtime deployables for each module are packaged within a container, which contains not only the application binaries but also all the dependent libraries and possibly even the application server. These containers are versioned and maintained in a separate container repository. If one of the application instances fails, a container orchestration engine detects this and, using automation scripts, immediately checks out from the repository the specific container version corresponding to the failed instance and starts it up as a replacement. This gives the application optimal agility and auto-healing features.
This is why, for organizations looking to take the plunge into a cloud transformation journey, leveraging the combination of microservices, DevOps and containers is not optional but a mandate.
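The auto-healing behaviour described above can be sketched as a simple reconciliation loop. The Python below is a toy in-memory model, not the API of any real orchestration engine: the `Orchestrator` and `Instance` classes and the image name `orders:1.4.2` are hypothetical, and pulling a versioned image from the repository is represented only by a comment.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Instance:
    instance_id: int
    image: str          # versioned image reference (hypothetical name)
    healthy: bool = True


@dataclass
class Orchestrator:
    """Toy in-memory stand-in for a container orchestration engine."""
    desired: dict[str, int]                      # image -> desired instance count
    running: list[Instance] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def start(self, image: str) -> Instance:
        # A real engine would pull this image version from the repository here.
        instance = Instance(next(self._ids), image)
        self.running.append(instance)
        return instance

    def reconcile(self) -> list[Instance]:
        """Drop failed instances and start replacements from the same image."""
        self.running = [i for i in self.running if i.healthy]
        started = []
        for image, wanted in self.desired.items():
            have = sum(1 for i in self.running if i.image == image)
            for _ in range(wanted - have):
                started.append(self.start(image))
        return started


orch = Orchestrator(desired={"orders:1.4.2": 2})
orch.reconcile()                 # initial rollout: starts two instances
orch.running[0].healthy = False  # simulate a crashed instance
replacements = orch.reconcile()  # detects the gap and starts a replacement
print([(r.instance_id, r.image) for r in replacements])
```

The key design choice is that the engine compares the desired state with the observed state and closes the gap, rather than reacting to individual failure events. Because the replacement is started from the same versioned image, it is the same build as the instance that failed.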
Everything at Google runs in a container. Google starts over 2 billion containers per week. — Joe Beda, Google Cloud Platform
So far we have focused only on the runtime benefits of containerization. However, it can be leveraged equally well during the build phase.
When we talk about microservices, we also talk about a polyglot architecture where each microservice can be developed on its own technology stack. The challenge then is to manage different development environments and workspaces for each microservice. This, again, can be resolved using separate containers that are pre-configured with the development environment and workspace for each microservice, so that a developer simply checks out the module-specific container from the repository, starts it up, and has the entire development setup ready.
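As a minimal sketch of this idea, the Python below builds (but does not execute) a `docker run` command that opens a development shell for a given microservice. The service names, the registry host `registry.example.com`, the image tags and the `/workspace` mount point are all hypothetical; the `--rm`, `-it`, `-v` and `-w` options are standard `docker run` flags.

```python
import shlex

# Hypothetical mapping from microservice to its pre-configured dev image.
DEV_IMAGES = {
    "orders": "registry.example.com/dev/orders-java:1.0",
    "search": "registry.example.com/dev/search-python:1.0",
}


def dev_container_command(service: str, workspace: str) -> list[str]:
    """Build the `docker run` command that opens a dev shell for a service."""
    image = DEV_IMAGES[service]
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{workspace}:/workspace",   # mount the checked-out source
        "-w", "/workspace",                # start in the mounted directory
        image,
    ]


cmd = dev_container_command("search", "/home/dev/src/search")
print(shlex.join(cmd))
```

A developer working on a different module would only change the service name; the toolchain for that stack comes from the image rather than from the developer's machine.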
Knowing the Container Ecosystem
Now that we have discussed the concepts of containerization & virtualization, it’s time to throw some light on some of the technologies that are used in this space and understand where they actually fit in the whole scheme of things, or how they are different from some of the other seemingly similar technologies.
When talking about containers, the very first term used almost synonymously to containers is Docker! But in reality they are not synonyms. In fact, Docker is nothing but a containerization tool that allows you to create, build and run containers using simple command line tools. It’s a container engine. The term “Docker” is also used in the context of a container image. The term “Docker image” represents the container image created using the docker tool.
Kubernetes (a.k.a. K8s) is a container orchestration framework that originated at Google. Cloud Foundry and Apache Mesos Marathon are similar technologies that operate in the same space as K8s, with slight nuances. Cloud Foundry is more app-focused, whereas K8s and Marathon are very much container-centric, but they basically do the same thing. Several PaaS platforms (PCF, Bluemix, etc.) are based on Cloud Foundry, whereas OpenShift (the PaaS platform from Red Hat) is based on K8s. Note: Cloud Foundry is an entire PaaS solution; one of its components, Diego, is what is responsible for container orchestration, using Garden containers (the successor to Warden).
Another term you will come across is Vagrant, which works much like the Docker tool but operates at the VM level rather than the container level. What the Docker tool is to containers, Vagrant is to VMs. Vagrant allows you to create, launch and stop VMs through a simple command line tool.
Puppet, Chef and Ansible are other terms you will often hear in this context. These are different tools for automating configuration management processes. You could use any of them to build scripts that automate the build, test, deployment and rollback of your applications. While Chef and Puppet are Ruby-based, Ansible is Python-based.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.