Containerization has been around for many years, and is by no means anything new in the developer community. However, mainstream adoption is something new, and happened shortly after Docker was launched in 2013. Orchestrating applications isn’t a new principle either; as long as containerization has existed, engineers have needed a way to manage their applications in an effective way.
Orchestration is the practice of keeping track of your applications, and making sure that if anything fails it will automatically be fixed, at least to the extent that doing so is possible.
Because this has been such a common practice for many years, it didn’t take long after mainstream adoption for the first container orchestrators to appear. In 2014, the year after Docker was launched, both Kubernetes and Docker Swarm were launched. Today, Kubernetes is the de facto default container orchestration platform, and is widely used all over the globe.
In this article, you’ll learn a bit about why you need to consider using a container orchestrator, as well as some reasons why you might want to avoid it. You’ll be given a short history lesson on how container orchestration has developed through the years, and finally you’ll be taken through an overview of how orchestration typically works, using Kubernetes as an example.
Why do you need container orchestration?
A container orchestrator takes care of managing a set of containers, based on a set of rules you’ve defined. For example, it can control scaling, health checks, and much more.
It’s assumed that before reading this article, you know about containers in general and all the benefits that they bring, such as logical separation, low-resource deployments, and self-contained application images. But do you know why you want to use an orchestrator for the specific containers you’re working with? Not everyone needs an orchestrator in their organization, but for many, it comes with a unique set of advantages.
First, a step back: the alternative
Before diving into why you need an orchestrator, it’s important to first understand what the alternative is. It’s certainly possible to run containers in production without using an orchestrator, but chances are it will be a subpar experience. The classic way is to spin up a virtual machine (VM) and then use port binding to expose services. This will likely require a reverse proxy to make sure clients can talk to your application.
If you want to avoid an orchestrator, you can also use cloud offerings like Azure’s App Service or Google’s App Engine. This will, in most cases, give your application a direct IP to which clients can then connect. However, you still need someone to manage the application if it suddenly stops, and someone to scale it if you experience traffic spikes. On top of that, you will also have to manage redundancy. This is certainly possible to do, but orchestrators specialize in use cases like this.
Benefits of container orchestration
An orchestrator will help with all of the above-mentioned scenarios. The most basic use case is handling unhealthy or failing containers. An orchestrator will periodically check to ensure that all containers are working properly. If it finds one that isn’t, in most cases, it will remove the malfunctioning container and spin up a new instance.
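To make the health-check idea concrete, here is a minimal sketch of a single check pass. It is not how any particular orchestrator implements it: the `Container` type, the `replace_container` helper, and the `probe` callback are all hypothetical names introduced for illustration, and a real orchestrator would call a container runtime instead of building new objects.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Container:
    """Hypothetical in-memory model of a running container."""
    name: str
    healthy: bool = True
    restarts: int = 0


def replace_container(failed: Container) -> Container:
    # Stand-in for removing the failed instance and starting a fresh one.
    return Container(name=failed.name, healthy=True, restarts=failed.restarts + 1)


def health_check_pass(
    containers: List[Container],
    probe: Callable[[Container], bool],
) -> List[Container]:
    """Run one check: keep containers that pass the probe, replace the rest."""
    return [c if probe(c) else replace_container(c) for c in containers]


fleet = [Container("web-a"), Container("web-b", healthy=False)]
for c in health_check_pass(fleet, probe=lambda c: c.healthy):
    print(c.name, c.restarts)
# web-a 0
# web-b 1
```

An orchestrator would run a pass like this on a timer, with the probe being something like an HTTP request or a command executed inside the container.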
Another common scenario is supporting multiple instances of your application, either to handle a high load or because you want redundancy. An orchestrator will also help in this case, as it will periodically check to ensure that you have the specified number of containers running. If you don’t, it will spin up new containers to make sure it matches your specification.
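The "compare what is running with what was specified" step can be sketched as a small reconciliation function. This is an illustrative assumption rather than real orchestrator code: the function name `reconcile_replicas` and the `app-index` naming scheme are made up for the example, and containers are represented by plain strings.

```python
from typing import List


def reconcile_replicas(running: List[str], desired: int, app: str) -> List[str]:
    """Return the container list after one reconciliation pass."""
    if desired < 0:
        raise ValueError("desired replica count cannot be negative")
    result = list(running)
    # Scale up: add containers until the count matches the specification,
    # skipping names that are already in use.
    next_index = 0
    while len(result) < desired:
        candidate = f"{app}-{next_index}"
        if candidate not in result:
            result.append(candidate)
        next_index += 1
    # Scale down: remove surplus containers from the end of the list.
    while len(result) > desired:
        result.pop()
    return result


print(reconcile_replicas(["web-0"], desired=3, app="web"))
# ['web-0', 'web-1', 'web-2']
print(reconcile_replicas(["web-0", "web-1", "web-2"], desired=1, app="web"))
# ['web-0']
```

The same function covers both directions, so a crashed container and a lowered replica count are handled by the same loop.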
Even in cases where everything is running as expected, it’s nice to have a backup for when things do go wrong. With an orchestrator, it’s also easy to configure backups, so client requests automatically fail over to backup resources. You can also improve your network management, as it’s easy to define custom networks inside an orchestrator. Often, each service will also be given a hostname, meaning you can refer to a stable name instead of having to configure IPs.
In some cases, an orchestrator can even help manage costs, as you can more easily run multiple services on fewer nodes, as well as scale more easily based on resource usage.
Downsides to orchestration
However, there are also downsides to orchestration that need to be considered. The major downside is that it requires a lot of additional expertise. Something like Kubernetes has four different certifications you can acquire, which are known to be some of the hardest certification exams in the industry. Because of this, orchestrators will add some administrative overhead to your organization, likely meaning you’ll have to hire additional infrastructure engineers. In some organizations, Kubernetes management can be an entire job on its own.
It’s up to you whether this outweighs the benefits you are getting from implementing an orchestrator. When you’re working with microservices, you’re working with many different applications that all need to run as effectively as possible. Managing these services independently can quickly become a major undertaking, which is why an orchestrator can provide tremendous value in this use case.
The same thing can be said for having a diverse infrastructure, where you’re managing many different kinds of services that can be grouped into clusters of similar services. An orchestrator will help you apply the same rules to all services that are alike.
In general, there are few downsides to orchestrators. Key ones to consider are:
- It can add unnecessary complexity to an infrastructure, especially if you’re only managing a few services.
- Moving your workloads to an orchestrator can be tedious, and it’s important to consider what benefits you gain from the move.
- Depending on your workload, using an orchestrator can quickly become more expensive than your normal infrastructure. This applies to both labor costs and resource costs.
The history of container orchestration
The history of container orchestration is, as of writing, less than a decade old—at least in the form we know it today. As mentioned previously, Linux containers have been around for a long time; however, containerization only gained mainstream adoption in 2013 and 2014. Containerization has two main orchestrators that are in use today, Kubernetes and Docker Swarm. Kubernetes was launched on June 7, 2014, and Docker Swarm was launched on October 16, 2014. Because these two options were launched around the same time, it was up to the community to figure out which of these options would become the de facto standard.
According to the most recent statistics, it is clear that the community chose Kubernetes. Since the community was all on board with Kubernetes, companies started focusing on the tool as well. For several years, engineers could also use DC/OS, a popular tool by D2iQ (formerly Mesosphere), to orchestrate their applications. However, in 2020, the company decided to sunset the platform in favor of focusing fully on providing a Kubernetes platform. This really cemented the fact that Kubernetes was the go-to solution for container orchestration.
Google launched their Kubernetes offering in 2014, which isn’t surprising, since Google is the company originally behind Kubernetes. In 2018 both Amazon Web Services and Azure launched their Kubernetes offerings, Elastic Kubernetes Service (EKS) and Azure Kubernetes Service (AKS), respectively. At this point, many engineers saw Kubernetes as the only option when it comes to container orchestration. Of course this wasn’t (and isn’t) true, as you still have D2iQ and Docker Swarm, but these tools are significantly less popular than Kubernetes.
The future of orchestration
In recent years, Kubernetes has started being integrated into the development process, as opposed to only being used in production. Many teams now use local Kubernetes solutions to test their applications inside the orchestrator during development. These are tools like minikube, which can spin up a local cluster, and Skaffold, which can automatically update your application inside a local Kubernetes cluster while you’re developing.
This is a sign of how deeply integrated orchestration is becoming in the modern cloud infrastructure. An increasing number of organizations are taking full advantage of orchestrators, which needs to be reflected in the development process as well. It’s very common today that general-purpose applications are containerized and run inside an orchestrator, while specific cloud resources are more likely to be used for functions and databases.
How does orchestration work?
As previously mentioned, Kubernetes is often seen as the default option when it comes to orchestrators. As such, this explanation of how orchestration works will be based on how Kubernetes works, but the same principles apply to any other orchestrator.
The control plane
The primary part of an orchestrator is the control plane. The control plane of an orchestrator consists of several parts that control different areas of the cluster, and collectively, control the entire cluster. In Kubernetes there are four main parts to the control plane. There’s the API server, which allows the various functions of Kubernetes to communicate with each other, and also allows you to communicate with the cluster.
Another part of the control plane is the scheduler, which takes care of scheduling the pods that contain your applications to the right nodes. Then there’s the controller manager, which runs the built-in controllers inside Kubernetes, such as the Deployment controller and the Node controller. The final part of the Kubernetes control plane is etcd, a key-value store that’s used to store the configuration and state of your cluster.
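The scheduler’s job can be illustrated with a toy version of the two phases the Kubernetes scheduler uses: filtering out nodes that cannot run the pod, then scoring the remaining ones. Everything below is a simplified assumption for illustration. The `Node` and `Pod` types, their field names, and the "most free CPU wins" scoring rule are hypothetical and far simpler than the real scheduler’s logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


@dataclass
class Pod:
    name: str
    cpu_millicores: int
    memory_mib: int


def schedule(pod: Pod, nodes: List[Node]) -> Optional[Node]:
    """Pick a node for the pod, or return None if no node can fit it."""
    # Filtering: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= pod.cpu_millicores
        and n.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None  # In a real cluster the pod would stay pending.
    # Scoring (toy rule): prefer the node with the most CPU left over,
    # using leftover memory as a tie-breaker.
    best = max(
        feasible,
        key=lambda n: (
            n.free_cpu_millicores - pod.cpu_millicores,
            n.free_memory_mib - pod.memory_mib,
        ),
    )
    # Binding: reserve the pod's resources on the chosen node.
    best.free_cpu_millicores -= pod.cpu_millicores
    best.free_memory_mib -= pod.memory_mib
    return best


cluster = [Node("a", 500, 1024), Node("b", 2000, 4096), Node("c", 4000, 256)]
chosen = schedule(Pod("api", cpu_millicores=1000, memory_mib=512), cluster)
print(chosen.name if chosen else "unschedulable")
# b
```

Here node `a` is filtered out for lacking CPU and node `c` for lacking memory, which is the kind of reasoning that helps when you debug why a pod landed on a particular node or never started at all.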
The other major part of any orchestrator is the set of worker nodes. This is where your applications will actually be run. In some cases, like when you want a low-resource cluster, the control plane and worker node will be the same machine. In most cases, though, the control plane will reside on a primary node, which only has the control plane, and your application will be run on separate worker nodes. These worker nodes need to have at least two things in order to function as part of a container orchestrator.
First, the worker nodes each need to have a container runtime; otherwise, the node simply won’t be able to run your containers. Which runtime is used is up to the orchestrator. Kubernetes clusters commonly use containerd by default, but you can choose any runtime that implements the Container Runtime Interface (CRI).
Second, each worker node needs a way to communicate with the control plane, so that the node knows when to spin up new containers or shut down running ones. In Kubernetes, this agent is called the kubelet. The kubelet takes care of managing the containers that are currently running on the worker node, as well as communicating with the control plane when changes need to be made to the workload.
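At its core, a node agent of this kind compares what the control plane has assigned to the node with what is actually running there. The sketch below shows only that comparison. It is not the kubelet’s actual implementation: `plan_node_sync` is a hypothetical function, and workloads are represented by plain names.

```python
from typing import Iterable, List, Tuple


def plan_node_sync(
    assigned: Iterable[str],
    running: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (to_start, to_stop) so the node matches its assignment."""
    assigned_set, running_set = set(assigned), set(running)
    # Assigned by the control plane but not yet running on this node.
    to_start = sorted(assigned_set - running_set)
    # Running on this node but no longer assigned to it.
    to_stop = sorted(running_set - assigned_set)
    return to_start, to_stop


to_start, to_stop = plan_node_sync(
    assigned=["web-0", "web-1"],
    running=["web-1", "old-job"],
)
print(to_start, to_stop)
# ['web-0'] ['old-job']
```

The agent would then hand the `to_start` and `to_stop` lists to the container runtime and report the resulting state back to the control plane.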
Bringing it together
A control plane and worker nodes are the key parts of any orchestrator, and while the implementations will be slightly different, the basic principles are the same. Kubernetes has one extra part that isn’t strictly necessary to make orchestration work, which is the kube-proxy. The kube-proxy is run on each worker node and helps manage the networking inside your Kubernetes cluster, allowing communication to your pods.
Orchestration isn’t a very complicated concept to understand, but it can be tough to implement into your organization effectively if you’re new to it. Understanding the key components of your chosen orchestrator will help immensely when you start working with it. For example, knowing how containers are scheduled onto worker nodes can help you a lot when debugging why something isn’t working.
As you can see, containerization and orchestration aren’t anything new in the developer community, but their popularity has grown dramatically over the past decade. Today, orchestration is almost synonymous with containerization, and engineers all over the world use Kubernetes to orchestrate their containers and make sure their infrastructure is working as efficiently as possible.
If you are running your containers in Kubernetes, or if you intend to start deploying your applications in Kubernetes, you need a way to keep track of what is happening inside your cluster.