The Kubernetes Story
And why you should care
Kubernetes is a powerful, open-source system, originally developed at Google, for managing containerized applications in a clustered environment. It supports cloud-native deployment, scheduling, replication, updating, and monitoring: essentially every stage of application lifecycle management. Over the past few years, Kubernetes has grown in popularity, becoming the de facto standard for deploying software in the cloud.
There are other tools on the market, but the steady evolution of Kubernetes as a product is key to its success. The platform's goal is to simplify cluster networking with a flat network structure, freeing admins from coordinating dynamic port allocation, planning custom routing rules and subnets, and more. Just as Kubernetes was meant to simplify application lifecycle management, this article is meant to simplify Kubernetes architecture and components.
What problems does Kubernetes solve?
As we discussed in my first blog series (the Hitchhiker’s Guide to Telco Transformation), containers are lightweight, scalable, and isolated. Containers need to be linked together, security policies set, resource utilization limited, and so on. Container orchestration handles all of this so that you don’t have to manage everything by yourself.
For example, many applications keep growing and evolving as more features and functionality are added. In many cases they become huge monoliths, at which point they are nearly impossible to manage because of their sheer CPU and RAM requirements. One solution is to split the application into smaller chunks, each with a specific task.
Now a caching layer and a queue system are needed for better asynchronous performance, and further challenges follow: service discovery, load balancing, health checks, storage management, auto-scaling, and so on.
Under all such chaotic circumstances, who can help? Correct, container orchestration!
What are the options for container orchestration?
The most popular orchestration solutions are Kubernetes, AWS ECS, and Docker Swarm, with Kubernetes the most popular of the three!
Why? Because Kubernetes:
- Offers the largest community.
- Solves all the major issues described.
- Is portable: it runs on most PaaS (Platform as a Service) offerings, on bare metal, and on hybrid combinations of these.
- Is also configurable and modular.
- Provides features like auto-placement, auto-restart, auto-replication, and auto-healing of containers.
Kubernetes is a system for running and coordinating applications across a cluster of machines. It manages the lifecycle of containerized applications and services, using methods that foster predictability, scalability, and high availability.
Kubernetes gives the user the freedom to decide and define how applications should run and communicate, making it easy to scale up or down, perform rolling updates, switch traffic between different application versions, and more. Kubernetes also offers different interfaces and platforms for defining and managing applications.
What is a node in Kubernetes? A node is one of the smallest units of computing in Kubernetes: a single machine in a cluster. A node does not need to be a physical machine; it can be either physical or virtual. In a data center, for example, a node is typically a physical machine, whereas on the Google Cloud Platform a node is a virtual machine.
There is always a layer of abstraction over the nodes, so there is no need to worry about the characteristics of any individual machine. Instead, each machine is seen as a set of CPU and RAM resources. The machines form a cluster, and their resources can be used as required. This, in turn, gives the system flexibility and dynamism, whereby any machine can substitute for any other machine in a Kubernetes cluster.
As we’ve discussed, nodes are small processing units that reside within a cluster. But you don’t need to worry about the state of individual nodes, because they are all managed as part of the same cluster. The cluster is intelligent! All the nodes pool their resources and together form one powerful machine.
When programs are deployed onto the cluster, cluster management handles the distribution dynamically; in short, it assigns tasks to individual nodes. If a node is added or removed along the way, the cluster shifts the work around as needed. One less thing for the programmer to think about.
As mentioned, programs run on the cluster, powered by the nodes, but they don’t run on specific nodes. Because programs move around dynamically, their output needs to be stored somewhere other than the node itself, and it cannot simply be dropped into an arbitrary local file system.
For example, suppose a program running on node A saves data to a file, and later the program is relocated to another node. When the program on node B looks for the data, it won’t be in the expected place, because the location has changed. For this reason, the traditional local storage on each node is treated as a temporary cache for running programs, and cached data does not persist.
So where is data stored for later use? Persistent volumes store it permanently. The cluster manages the CPU and RAM resources of all the nodes, but it is not itself responsible for permanent storage. Instead, local drives and cloud drives can be attached to the cluster to act as persistent volumes, something like plugging an external drive into the cluster. Persistent volumes offer a file system that serves the cluster without being tied to any specific node.
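As a minimal sketch, an application typically requests such storage through a PersistentVolumeClaim; the claim name and size below are hypothetical placeholders, and the manifest assumes the cluster has a default storage class that can satisfy the request:

```yaml
# Hypothetical claim: ask the cluster for 1 GiB of persistent storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data            # placeholder name for the claim
spec:
  accessModes:
    - ReadWriteOnce         # mounted read-write by a single node at a time
  resources:
    requests:
      storage: 1Gi          # requested capacity
```

A pod can then mount `app-data` as a volume, and the data survives even if the pod is rescheduled onto a different node.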
In Kubernetes, Linux containers host the programs. Containers are globally accepted, and many pre-built images already exist: a program and its dependencies are packed into one single image and shared on the internet (e.g. Docker Hub), so anyone can download the container image and deploy it on their infrastructure when needed. Deployment is hassle-free, with just a bit of setup. Containers can also be built locally, which enables the formation of effective CI and CD pipelines.
A container is capable of running multiple programs, but it is recommended to limit each container to one process, because that helps during troubleshooting. Small containers are easy to update and easy to deploy, so it is better to have many small containers rather than one big one.
Kubernetes has some unique features, and one of them is that it doesn’t run containers directly. Rather, it wraps one or more containers into a pod. Any containers within the same pod share the same resources and the same local network, so they can communicate with each other easily. They are isolated from the outside, but readily available to each other for communication.
Pods can be replicated in Kubernetes, for example when an application becomes popular and is no longer sustainable on a single pod. In that case, Kubernetes can deploy as many replicas of the pod as required. But replication is not only for heavy loads: a pod can be replicated under normal conditions as well, which helps with uniform load balancing and with preventing failures.
Pods are capable of holding multiple containers, but should be limited to one or two if possible. The reason is that pods scale up and down as a single unit, and the containers within a pod scale together with it.
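To make this concrete, here is a minimal sketch of a pod wrapping two containers that share the pod’s local network; the pod name is a placeholder, and the image tags are illustrative:

```yaml
# Hypothetical two-container pod: the containers can reach each
# other over localhost because they share the pod's network.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache      # placeholder pod name
spec:
  containers:
    - name: web
      image: nginx:1.25     # main application container
      ports:
        - containerPort: 80
    - name: cache
      image: redis:7        # helper container in the same pod
```

Because both containers live in one pod, they are scheduled onto the same node and scale as a single unit, which is exactly why keeping the container count low is advisable.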
You may have noticed that pods are the basic units in Kubernetes. However, they are not usually launched directly on a cluster. Instead, they are managed by one or more layers of abstraction, such as a deployment, whose purpose is to decide and declare the number of replicas running at a given time.
When a deployment is created, it spins up the requested number of pods and monitors them. If a pod no longer exists, the deployment re-creates it. The good thing is that with deployments there is no need to deal with pods manually; everything is managed automatically.
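A deployment that declares three replicas might look like the following sketch; the names, labels, and image are hypothetical:

```yaml
# Hypothetical deployment: Kubernetes keeps three replicas of the
# pod template running, re-creating any pod that dies.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3               # desired number of pod replicas
  selector:
    matchLabels:
      app: web              # the deployment manages pods with this label
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
```

Changing `replicas` (or the image tag) and re-applying the manifest is all it takes to scale or roll out an update; the deployment reconciles the running pods to match the declaration.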
We have discussed all the basic concepts of Kubernetes. Using them, you can create a cluster of nodes. Once the cluster is made, it is time to launch deployments of pods on the cluster. But how will you allow external traffic to access your application?
By design, Kubernetes isolates pods from the outside world. To communicate with a service running within a pod, an outside client needs a channel to be opened. This channel, the medium of communication, is known as “ingress”.
There are numerous ways of adding an ingress to the cluster, the most common being an Ingress controller or a load balancer.
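With an Ingress controller installed, a routing rule can be declared as in the sketch below; the hostname and backend service name are hypothetical, and the manifest assumes a Service named `web` already exposes the pods on port 80:

```yaml
# Hypothetical ingress rule: route HTTP traffic for example.com
# to the "web" Service inside the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: example.com     # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web   # assumes this Service exists
                port:
                  number: 80
```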
Most important of all, Kubernetes has an active online community. Its members meet up online as well as in person in major cities around the world, and the international conference “KubeCon” has proved to be a huge success. There is also an official Slack workspace for Kubernetes, and major cloud providers like Google Cloud Platform, AWS, Azure, and DigitalOcean offer their own support channels.