Welcome to the final article of the series on migrating Elasticsearch to Kubernetes!
To give you a quick recap: In previous articles, I discussed various details from the implementation stage, including the Elasticsearch and Kibana deployments via Helm charts. I also discussed the post-cluster setup, which we automated via ConfigMap and CronJob.
Following on from these steps, I’ve dedicated this last article to describing the lessons we learned when we migrated Elasticsearch to Kubernetes. I’ll also discuss all the migration’s benefits and the drawbacks we encountered.
Which Main Lessons Did We Learn?
First things first, let’s go through some of the most important lessons we learned while migrating Elasticsearch to Kubernetes.
1. Planning Is Key
There are many ways to deploy Elasticsearch to Kubernetes. Our DevOps team wanted to use a Kubernetes operator installed via Helm charts, as this is one of the most scalable and maintainable ways to deploy applications on Kubernetes. We also wanted everything to be provisioned by Terraform, so we used Terraform’s helm_release resource.
After reviewing enough documentation, we deployed a simple one-node Elasticsearch cluster. Once we had validated our initial setup, we scaled it to a cluster with three master nodes and multiple data nodes. Then, we worked on the various tasks mentioned in the previous articles. The whole process ultimately involved planning, executing, validating, and then repeating these steps, proceeding in incremental stages and treating each success as a checkpoint.
At the migration stage, we planned to route all of our logs from different applications toward the new Elasticsearch setup, and we wanted to retain our previous cluster, which had older logs. We only brought down the old cluster after our minimum log retention period had passed. This ensured two things: a) we had old logs from applications; b) we had our old cluster as a backup in case anything went wrong with the new one.
2. StatefulSets Are Essential
Elasticsearch is a stateful application, which means that it needs to maintain its state across restarts. In Kubernetes, StatefulSets ensure that pods are created and destroyed in a specific order and that their data is preserved.
3. PersistentVolumes Are Crucial
Elasticsearch stores its data on disk, so it’s important to use PersistentVolumes to ensure that data is not lost when pods are recreated. Our team uses PersistentVolumeClaims, which request a storage resource (a PersistentVolume) for a pod whenever it is created or recreated.
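As a rough illustration of how a claim is attached to each pod, the sketch below builds an ECK-style node set as a plain Python dictionary. The node set name, pod count, storage size, and storage class are illustrative assumptions rather than values from our setup; `elasticsearch-data` is the claim name ECK expects for the data volume.

```python
import json


def data_node_set(name: str, count: int, storage: str, storage_class: str) -> dict:
    """Build one ECK nodeSet entry with a per-pod PersistentVolumeClaim template."""
    return {
        "name": name,
        "count": count,
        "volumeClaimTemplates": [
            {
                # ECK mounts the claim with this name as the Elasticsearch data path.
                "metadata": {"name": "elasticsearch-data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                    "storageClassName": storage_class,
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; adjust them to your own cluster.
    print(json.dumps(data_node_set("data", 3, "100Gi", "gp3"), indent=2))
```

The resulting dictionary can be serialized to YAML or JSON and placed under `spec.nodeSets` of an Elasticsearch resource.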
4. Plan the CPU and Memory Resources Required for Master/Data Pods
Resource planning is essential. The resources you need can vary depending on the volume of logs your cluster ingests, the number of indices, and how much and how fast those indices grow. You also need to take into consideration the resources that the JVM itself needs.
If you are migrating from an existing cluster, you might have an idea about the resources needed and can set the pod resources accordingly. Still, it might take a few attempts of configuring and monitoring to reach the final resource allocation.
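One way to sanity-check the JVM part of this planning is a small calculation. The sketch below assumes the commonly cited guidance that the heap should take no more than about half of the memory available to the node and should stay below the compressed-oops threshold (taken here as 31 GiB, an assumed round number). The function names are hypothetical.

```python
def suggested_heap_mib(container_memory_mib: int, max_heap_mib: int = 31 * 1024) -> int:
    """Return a heap size of half the container memory, capped at max_heap_mib."""
    if container_memory_mib <= 0:
        raise ValueError("container memory must be positive")
    return min(container_memory_mib // 2, max_heap_mib)


def jvm_options(container_memory_mib: int) -> str:
    """Format -Xms/-Xmx flags with identical values so the heap is not resized."""
    heap = suggested_heap_mib(container_memory_mib)
    return f"-Xms{heap}m -Xmx{heap}m"


if __name__ == "__main__":
    # Example: a data pod with an 8 GiB memory limit.
    print(jvm_options(8192))
```

Recent Elasticsearch versions can also size the heap automatically from the container's memory limit, so treat the result as a starting point for monitoring rather than a final value.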
5. Monitoring and Logging Are Important
Monitoring and logging are essential for troubleshooting and debugging issues. In Kubernetes, you can use tools like Prometheus and Fluentd to monitor and log your Elasticsearch cluster.
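Alongside dedicated tools, a basic health probe can be useful for quick checks. The sketch below calls Elasticsearch's `_cluster/health` API with the standard library and flags anything that is not green. It assumes an endpoint reachable without authentication, which is unlikely on a default ECK cluster, so credentials and TLS handling would need to be added. The function names and the URL are hypothetical.

```python
import json
import urllib.request


def fetch_cluster_health(base_url: str, timeout: float = 5.0) -> dict:
    """Call the _cluster/health API and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def needs_attention(health: dict) -> bool:
    """Flag anything other than a green cluster with no unassigned shards."""
    return health.get("status") != "green" or health.get("unassigned_shards", 0) > 0


if __name__ == "__main__":
    # Hypothetical in-cluster address; replace it with your own service URL.
    health = fetch_cluster_health("http://localhost:9200")
    print(health.get("status"), "needs attention:", needs_attention(health))
```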
6. Test Thoroughly
Before deploying to production, it is important that you thoroughly test the Elasticsearch cluster in a development environment to ensure that it is configured correctly and that there are no issues.
7. Specify Annotations
If you want to manage load balancers via AWS, you need to create an ingress object. Make sure the load balancer is private by specifying the appropriate annotations on that ingress object.
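The exact annotations depend on the ingress controller in use. As a sketch, assuming the AWS Load Balancer Controller, an internal (private) load balancer is requested with the `alb.ingress.kubernetes.io/scheme` annotation. The ingress name, service name, and port below are hypothetical.

```python
import json


def internal_ingress(name: str, service_name: str, service_port: int) -> dict:
    """Build an Ingress manifest that asks for an internal AWS load balancer."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                # Assumes the AWS Load Balancer Controller: keep the ALB private.
                "alb.ingress.kubernetes.io/scheme": "internal",
                # Route traffic directly to pod IPs.
                "alb.ingress.kubernetes.io/target-type": "ip",
            },
        },
        "spec": {
            "ingressClassName": "alb",
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical Kibana service name and port.
    print(json.dumps(internal_ingress("kibana", "kibana-kb-http", 5601), indent=2))
```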
What Were the Pros of the Migration?
You can deploy Elasticsearch on Kubernetes in different ways, with some popular solutions including Helm charts, operators, or custom controllers. Make sure you choose the solution that works best for your use case.
As you might remember, we wanted to migrate to Kubernetes to reduce hosting costs and ensure convenience and security for our developers here at adjoe. So, with this in mind, these are just some of the pros we identified as a team when we migrated to Kubernetes.
- The ECK Operator brought us mostly benefits and automated various mundane tasks. For example, when we decrease the number of data nodes in a StatefulSet, the ECK Operator relocates all the shards away from the data pods at the top of the stack and then terminates those pods. Without the ECK Operator, removing nodes from the cluster would require draining them manually: we would have to tell the cluster to stop allocating shards to those nodes and to reallocate their existing shards to other nodes. Only then would the nodes be safe to remove from the cluster.
- We previously needed to upgrade Elasticsearch versions manually by bringing down the nodes one by one and upgrading each of them. This required much time and effort. The Elasticsearch StatefulSet, however, performs a rolling update of the pods incrementally. Also, the operator that sits above the StatefulSet ensures the cluster is in a green state before proceeding to the next pod.
- The ECK Operator incorporates most of the security best practices by default into the cluster.
- Kubernetes is a better platform for automating various tasks with the help of ConfigMaps and CronJobs.
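To make the manual draining mentioned in the first point more concrete, here is a minimal sketch of what the operator saves us from doing by hand. It builds the body for a `PUT _cluster/settings` request using the `cluster.routing.allocation.exclude._name` setting and checks rows from `_cat/shards?format=json` to see whether any shard still sits on an excluded node. The function and node names are hypothetical.

```python
import json


def exclude_nodes_body(node_names: list[str]) -> dict:
    """Cluster-settings body telling Elasticsearch to move shards off the named nodes."""
    return {
        "persistent": {
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }


def safe_to_remove(shard_rows: list[dict], node_names: list[str]) -> bool:
    """True when no shard row still lists an excluded node as its current node."""
    excluded = set(node_names)
    for row in shard_rows:
        # Relocating shards report "source -> ip id target"; the source comes first.
        current = str(row.get("node") or "").split(" ")[0]
        if current in excluded:
            return False
    return True


if __name__ == "__main__":
    print(json.dumps(exclude_nodes_body(["es-data-3", "es-data-4"])))
```

Once `safe_to_remove` returns true for the nodes in question, they hold no shards and can be shut down; the exclusion setting should be cleared afterwards.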
What Were the Cons of the Migration?
Despite the extra layer of abstraction between the Elasticsearch cluster and the underlying infrastructure, Kubernetes events allow us to troubleshoot issues by giving us more insight into them, such as when a pod is not getting scheduled.
However, the events have a retention period of 60 minutes in EKS and will be lost if not saved to some other location. Also, if a pod is crash-looping due to the application, we might not be able to extract the logs of the crashed container if Kubernetes is not connected to a central logging system.
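One simple way to keep events beyond their retention period is to export them periodically, for example from a CronJob. The sketch below assumes `kubectl` is available with read access to events. It flattens the JSON event list into one line per event, which can then be shipped to whatever storage you use. The function names and the selection of fields are our own choices for illustration.

```python
import json
import subprocess


def event_lines(event_list: dict) -> list[str]:
    """Flatten a Kubernetes EventList into one compact JSON line per event."""
    lines = []
    for item in event_list.get("items", []):
        involved = item.get("involvedObject", {})
        record = {
            "time": item.get("lastTimestamp") or item.get("eventTime"),
            "namespace": involved.get("namespace"),
            "kind": involved.get("kind"),
            "name": involved.get("name"),
            "type": item.get("type"),
            "reason": item.get("reason"),
            "message": item.get("message"),
        }
        lines.append(json.dumps(record, sort_keys=True))
    return lines


def dump_events() -> list[str]:
    """Read all current events through kubectl and return them as JSON lines."""
    result = subprocess.run(
        ["kubectl", "get", "events", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return event_lines(json.loads(result.stdout))


if __name__ == "__main__":
    for line in dump_events():
        print(line)
```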
What to Know When Migrating to Kubernetes
In conclusion, migrating Elasticsearch to Kubernetes can offer various benefits, but it is not without its challenges. By planning carefully, using StatefulSets and PersistentVolumes, monitoring and logging, and thoroughly testing the Elasticsearch cluster, you can ensure a successful migration and reduce hosting costs like our DevOps team did.