Kubernetes began as a project by a group of Google engineers to orchestrate containers. Google open-sourced it and later donated it to the Cloud Native Computing Foundation (CNCF). Today it is one of the most popular open-source systems and the de facto standard for running containers.
There are various reasons for that. We’ll look into what Kubernetes is in the first place, why it is so popular, and what problems it solves, in plain and simple terms.
What is Kubernetes?
Kubernetes is an open-source container-orchestration system for automating application deployment, scaling, and management.
There are two main words here: container and orchestration. We need to understand what each one is to understand Kubernetes.
What are containers?
According to the Kubernetes documentation, “containers are a technology for packaging the (compiled) code for an application along with the dependencies it needs at run time. Each container that you run is repeatable; the standardization from having dependencies included means that you get the same behavior wherever you run it.”
If we need to spin up a stack of applications in a server, such as a web application, database, messaging layer, etc., this will result in the following scenario:
There is a hardware infrastructure on which an operating system (OS) runs, and you install the libraries and application dependencies on the OS. Different applications then share the same libraries and dependencies to run.
The matrix of hell
The design has multiple problems. A web server might need a different version of a library as compared to the database server. If we need to upgrade one of the dependencies, we need to ensure that we do not impact another app that might not support it. This scenario is known as the matrix of hell and is a nightmare for developers and admins alike.
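To make the conflict concrete, here is a minimal Python sketch. The application names, library names, and version numbers are all hypothetical and chosen only for illustration. It flags any shared library that applications on the same OS need at different versions.

```python
from collections import defaultdict

# Hypothetical requirements: each application pins the library versions it needs.
requirements = {
    "web-app": {"libssl": "1.1", "libxml": "2.9"},
    "database": {"libssl": "3.0", "libxml": "2.9"},
    "messaging": {"libssl": "1.1"},
}


def find_conflicts(reqs):
    """Return the libraries that different apps need at different versions."""
    versions = defaultdict(set)
    for app_libs in reqs.values():
        for lib, version in app_libs.items():
            versions[lib].add(version)
    # When the apps share one installed copy of a library, more than one
    # required version means at least one application is at risk of breaking.
    return {lib: sorted(v) for lib, v in versions.items() if len(v) > 1}


conflicts = find_conflicts(requirements)
print(conflicts)  # {'libssl': ['1.1', '3.0']}
```

Every additional application or library adds another row or column to check, which is why the problem is described as a matrix.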
It works on my machine
“It works on my machine” is a typical exchange between a developer and a tester: the developer says the application works perfectly on their machine, yet it fails in the test environment. Likewise, things that worked in the test environment may behave differently in production. The reason? The matrix of hell.
Before containers, organizations solved this issue by using virtual machines. A virtual machine (VM) is an emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer using software called a hypervisor, such as VMware ESXi. In a typical VM-based stack, each application runs in its own virtual machine, with its own guest OS, libraries, and dependencies, on top of the hypervisor.
We have resolved the dependency problem, and we are out of the matrix of hell. This architecture was groundbreaking and is still in use today.
Problems of the virtual machine era
We were trying to solve the runtime, library, and dependency problems, but we introduced a heavyweight guest OS layer in between, which has its disadvantages. A virtual machine is heavy and slower to start.
Also, we need to allocate a minimum amount of resources to each guest OS, and organizations tend to size VMs for peak load rather than typical load. As a result, VMs waste a lot of computing resources, because a significant percentage of the allocated resources remains unused.
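A quick back-of-the-envelope calculation shows how over-provisioning adds up. The figures below are hypothetical and not measurements from any real environment. The sketch compares the CPUs allocated to each VM with the CPUs used on average.

```python
# Hypothetical figures for illustration only.
vms = [
    {"name": "web", "allocated_cpus": 8, "average_used_cpus": 2.0},
    {"name": "database", "allocated_cpus": 16, "average_used_cpus": 4.0},
    {"name": "messaging", "allocated_cpus": 8, "average_used_cpus": 2.0},
]


def utilization(machines):
    """Fraction of allocated CPU capacity that is actually used on average."""
    allocated = sum(vm["allocated_cpus"] for vm in machines)
    used = sum(vm["average_used_cpus"] for vm in machines)
    return used / allocated


print(f"Average utilization: {utilization(vms):.0%}")  # Average utilization: 25%
```

With these assumed numbers, three quarters of the allocated capacity sits idle most of the time.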
Containers address the problem by treating servers as servers. We no longer have a separate VM for the web server, the database, and the messaging layer. Instead, each of them runs in its own container on the same host OS.
We have gotten rid of the guest OS dependency, and containers now run as separate processes within the same OS. Containers rely on container runtimes. Well-known container runtimes include Docker, rkt (originally called Rocket), and containerd. The most popular of them, and more or less the de facto standard in container runtime technology, is Docker.
A container runtime provides an abstraction layer that lets each application be self-contained, so that applications and their dependencies are independent of one another.
Containers solve a lot of problems:
Containers are portable – A container doesn’t care where it is running and behaves the same in all environments.
Containers are more efficient – Since containers do not contain a guest OS, they boot up extremely fast. While it takes minutes to boot a VM, it takes seconds to spin up a new container.
Containers are scalable – As containers boot fast and have a low footprint, spinning up new containers is very easy. You can have multiple containers scaling up and down independently within an OS and sharing the resources.
Containers are lightweight – Containers have a very light footprint. You do not need to allocate set resources to a container, and it can use the underlying OS resources, similar to an application. They require fewer computing resources to run than VMs.
Because of this, containers give us a lot of power. For the best resource utilization, you run only as many containers as you need at a given time.
As containers are temporary, if something goes wrong within your container, you can simply destroy the existing one and create a new one. That allows you to do canary deployments, blue-green deployments and A/B testing with ease.
You cannot do all of this manually. You need something that takes care of these aspects for you, and that something is a container orchestration platform, such as Kubernetes.
Introducing container orchestration using Kubernetes
The idea of using Kubernetes is simple. You have a cluster of servers that are managed by Kubernetes, and Kubernetes is responsible for orchestrating your containers within the servers. You treat servers as servers, and you run applications within self-contained units called containers.
Since containers can run the same in any server, it doesn’t matter what server your container is running on. If you need to scale your cluster, you can add or remove nodes to the cluster without worrying about the application architecture, zoning, roles, etc. You handle all of these at the Kubernetes level.
Kubernetes uses a simple concept to manage containers. There are control plane nodes (historically called master nodes), which control and orchestrate the container workloads, and worker nodes, where the containers run. Kubernetes runs containers in pods, which form the basic building block in Kubernetes.
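To show what a pod looks like in practice, here is a minimal sketch of a Pod manifest built as a Python dictionary and printed as JSON. The pod name, label, and image tag are hypothetical examples. The top-level fields (`apiVersion`, `kind`, `metadata`, `spec`) are the standard ones for a Kubernetes Pod object.

```python
import json

# A minimal Pod manifest expressed as a Python dict.
# The name "web", the label, and the image tag are hypothetical examples.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}

# Kubernetes accepts manifests as JSON as well as YAML.
print(json.dumps(pod_manifest, indent=2))
```

Saved to a file such as `pod.json` (a name assumed here), the output could be submitted to a cluster with `kubectl apply -f pod.json`.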
Kubernetes essentially does the following:
- Communicates with the underlying container runtime – Kubernetes is a container orchestration platform. It communicates with the underlying container runtime to manage the containers.
- Stores the state of the expected configuration – Kubernetes is declarative. You just need to define what setup you need, and Kubernetes does it for you. It uses an etcd datastore to store the expected configuration.
- Maintains state based on the expected configuration – Kubernetes continuously works to keep the actual state of the cluster in line with the expected configuration stored in the etcd datastore.
- Provides an abstract software-based network orchestration layer – Kubernetes gives pods a cluster-wide network, often implemented as an overlay network, that allows pods to communicate with each other. It doesn’t matter what servers your containers run on.
- Provides built-in service discovery – Containers are ephemeral resources. Therefore, if your container misbehaves and you decide to recreate it, the new container gets a different IP address, which might be a problem for anything that talks to it. Kubernetes solves this by providing a stable IP address and a DNS name that route requests to a pool of container instances.
- Health checks the workloads – Kubernetes checks that the container workloads running within the cluster are healthy, and if they are not, it destroys and recreates the containers.
- Requests resources from the cloud provider – If you are running Kubernetes within a cloud provider such as GCP or Azure, it can use the cloud APIs to dynamically provision resources such as load balancers and storage. This way, you have a single control plane for managing everything you would need to run your applications within containers.
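The declarative model in the list above can be illustrated with a small reconciliation loop. This is a simplified sketch and not Kubernetes's actual controller code: plain dictionaries stand in for the etcd datastore and the container runtime, and the workload names and replica counts are hypothetical.

```python
def reconcile(desired, actual):
    """Return the actions needed to move the actual state toward the desired state."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is not in the desired state gets stopped entirely.
    for name, have in actual.items():
        if name not in desired:
            actions.append(("stop", name, have))
    return actions


def apply_actions(actions, actual):
    """Simulate the container runtime carrying out the actions."""
    new_state = dict(actual)
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        new_state[name] = new_state.get(name, 0) + delta
        if new_state[name] == 0:
            del new_state[name]
    return new_state


# Desired replica counts (what you declare) versus what is currently running.
desired = {"web": 3, "db": 1}
actual = {"web": 1, "cache": 2}

actions = reconcile(desired, actual)
print(actions)  # [('start', 'web', 2), ('start', 'db', 1), ('stop', 'cache', 2)]

actual = apply_actions(actions, actual)
print(actual == desired)  # True
```

Running `reconcile` again once the states match returns an empty list. The point of the pattern is that you declare the end state and the loop works out the steps.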