Running Stateful Apps on Kubernetes - Best Practices
Kubernetes, or K8s, is an open-source orchestration system for containerized applications that automates software deployment, scaling, and management. In the early days of containers, Kubernetes was mainly used to run stateless applications. More recently, the Kubernetes community has added the features and support needed to run large stateful applications such as databases, analytics, and machine learning workloads.
Although Kubernetes delivers full orchestration support for stateful applications, it still has several storage-related gaps that have been purposely left for vendors to fill.
To create pods for stateful applications, Kubernetes offers Persistent Volumes (PVs) and workload controllers such as DaemonSets and StatefulSets, which remain in operation while Kubernetes scales and provisions cluster resources and ensure that existing client connections don't break.
In this article, we will discuss what stateful applications are, why it’s beneficial to run them on Kubernetes, their use cases, best practices, and the ways to deploy stateful applications.
What is a Stateful Application in Kubernetes?
Unlike stateless applications, where client data is not saved on the server between sessions, stateful applications store data on persistent disk storage and keep track of it so that it can be reused by clients, servers, or other applications.
With stateful applications, users can resume previous operations because the previous context is recorded with each transaction. For this reason, stateful applications must ensure that a user always reaches the same instance of the application, or have a plan to synchronize data across instances.
To deploy and run stateful applications, Kubernetes uses the StatefulSet controller, in which each pod is non-interchangeable and keeps a stable, unique identifier.
Use Cases of Kubernetes Stateful Application
Kubernetes is highly sought after for running stateful applications. It simplifies the deployment and operation of containerized stateful applications in complex environments while maintaining statefulness, ensuring cohesive processes from development to deployment.
Some of the common use cases of containerized stateful applications are as follows:
AI and data analysis: Kubernetes stateful applications include databases and complex distributed applications for artificial intelligence and big data. For example, Kubernetes offers multi-service environments built on open-source frameworks like Spark, Hadoop, and Kafka, as well as commercial tools, for large-scale data processing, machine learning, data science, analytics, visualization, and business intelligence.
Machine Learning Operations: Statefulness in containerized applications helps an MLOps environment share storage abstractions and training data, and checkpoint long-running training jobs.
What is StatefulSet?
StatefulSets are a workload API object in Kubernetes that manages the deployment and scaling of pods for stateful applications while guaranteeing the ordering and uniqueness of those pods.
A StatefulSet manages an identical set of pods based on a container spec, but unlike a Deployment, each pod keeps its own identity. Although the pods are created from the same spec, they aren’t interchangeable: each has a persistent identifier that is maintained across every rescheduling.
StatefulSets are beneficial for apps that need:
Steady and persistent storage
Stable unique network identifiers
Ordered deployment, scaling and automated rolling updates
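As a minimal sketch, a StatefulSet for a hypothetical database might look like the following; the names, image, and storage size are illustrative:

```yaml
# Headless service that gives each pod a stable DNS name
# (db-0.db.<namespace>.svc.cluster.local, db-1.db..., etc.)
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None        # headless: no load-balanced virtual IP
  selector:
    app: db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db        # must match the headless service above
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16       # illustrative image
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # one PVC per pod, kept across rescheduling
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

The `volumeClaimTemplates` section is what ties each pod's identity to its storage: pod `db-0` always reattaches to the claim `data-db-0`, even after rescheduling.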
Best Practices for Running Stateful Applications on Kubernetes
To run stateful application workloads on Kubernetes efficiently, here are a few recommendations you can follow:
Develop one operator for each application
A dedicated operator can automate an application's features based on domain expertise. Keep each operator scoped to a single application to improve the separation of concerns.
Use SDKs like Kubebuilder
Kubebuilder is a framework for building Kubernetes APIs and controllers using custom resource definitions (CRDs). With Kubebuilder, writing operators becomes easier because you don’t have to learn low-level details of how Kubernetes libraries are implemented.
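Kubebuilder generates CRDs from Go type definitions; this hand-written sketch shows the rough shape of such a CRD for a hypothetical `Database` resource (the group, kind, and fields are invented for illustration):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com     # must be <plural>.<group>
spec:
  group: example.com              # hypothetical API group
  names:
    kind: Database
    plural: databases
    singular: database
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                storageSize:
                  type: string
```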
Another best practice is to separate every stateful application into its namespace to ensure clear isolation and make resource management easier.
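For example, each stateful application can get its own namespace (the name is illustrative), and all of its resources are then created there:

```yaml
# One namespace per stateful application keeps its resources,
# quotas, and access policies isolated from other workloads
apiVersion: v1
kind: Namespace
metadata:
  name: orders-db-prod
```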
Use Declarative APIs
Use declarative APIs instead of imperative APIs for operators; this aligns naturally with Kubernetes APIs, which are also declarative. With declarative APIs, users specify the cluster’s desired state, and the operator performs all the steps required to achieve it.
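A declarative custom resource describes only the desired state, leaving the "how" to the operator. Assuming a hypothetical `Database` kind registered by the operator, it might look like this:

```yaml
apiVersion: example.com/v1      # hypothetical API group
kind: Database
metadata:
  name: orders-db
spec:
  replicas: 3                   # desired state only; the operator
  storageSize: 20Gi             # works out the steps to reach it
```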
Keep all custom configurations and scripts in a ConfigMap to ensure that each application’s configuration is handled declaratively.
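For instance, a database configuration file can live in a ConfigMap (the name and settings are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-config
data:
  postgresql.conf: |
    max_connections = 200
    shared_buffers = 256MB
```

The ConfigMap can then be mounted into the container as a volume, so configuration changes are made declaratively through the manifest rather than by editing files inside pods.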
Compartmentalize features using multiple controllers
An application often has multiple features, such as backup, scaling, monitoring, and restore, so an operator should use multiple controllers to manage them. This simplifies development through simpler sync loops and better abstraction.
If you want to manage service routing as your application grows, use a headless service instead of a load balancer.
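A headless service is declared by setting `clusterIP: None`; instead of a load-balanced virtual IP, DNS returns one record per ready pod, so clients can address individual replicas directly (names and port are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: db-headless
spec:
  clusterIP: None      # headless: DNS returns one A record per ready pod
  selector:
    app: db
  ports:
    - port: 5432
```

When a StatefulSet's `serviceName` points at this service, each pod also gets a stable per-pod DNS name of the form `<pod>.db-headless.<namespace>.svc.cluster.local`.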
Asynchronous sync loops
If the operator detects an error while reconciling the current cluster state, it should immediately stop the ongoing sync and return the error; the work queue will then reschedule the sync for a later time. The sync should not block the application: the operator should keep polling the cluster state until the issue is resolved.
Manage all secrets with a robust secret management system so they don’t create security risks for production applications.
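Rather than hard-coding credentials in manifests or ConfigMaps, pods should reference a Secret that the secret management system creates out of band. A fragment of a pod template might look like this (the Secret name and key are illustrative):

```yaml
# Fragment of a pod template: the credential comes from a Secret,
# never from the manifest itself
containers:
  - name: postgres
    image: postgres:16
    env:
      - name: POSTGRES_PASSWORD
        valueFrom:
          secretKeyRef:
            name: db-credentials   # Secret created by the secret manager
            key: password
```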
Careful storage planning
Another important practice is to determine the application's persistent storage requirements, ensure the underlying storage is ready for use by the cluster, and define StorageClasses and PersistentVolumeClaims (PVCs) to guarantee that the resources each component of the application needs are available.
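As a sketch, a StorageClass and a matching claim might look like the following; the class name is illustrative, and the right provisioner depends entirely on your platform:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/no-provisioner   # e.g. pre-provisioned local disks
volumeBindingMode: WaitForFirstConsumer     # bind only once a pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  storageClassName: fast-ssd
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
```

`WaitForFirstConsumer` is often useful for stateful workloads because the volume is provisioned in the same zone where the pod lands.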
Ways to Deploy Stateful Kubernetes Application
To run stateful workloads with Kubernetes, you can choose one of three approaches: using managed cloud services, running natively in Kubernetes, or running outside Kubernetes.
Running Workload Using Cloud Services
Using cloud services like Google Cloud or AWS, you can run your stateful workload outside of Kubernetes as a managed cloud service. You will no longer need to scale, spin up, and manage databases and a redundant infrastructure stack yourself, as these are handled by the external service.
For example, if you are running containerized applications on AWS and want to add a SQL database, you can use Amazon RDS (Relational Database Service). Because managed databases can be scaled elastically, the stateful services can easily adjust to increased demand whenever Kubernetes scales up.
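One common pattern for wiring this up is an ExternalName service, so in-cluster applications reach the managed database through a stable Kubernetes name (the hostname below is an invented placeholder, not a real endpoint):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders-db
spec:
  type: ExternalName
  # illustrative RDS endpoint; replace with your instance's hostname
  externalName: mydb.abc123.us-east-1.rds.amazonaws.com
```

Pods can then connect to `orders-db` and the cluster DNS resolves it to the external endpoint, so swapping the database later only means editing this service.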
Running Workload in Native Kubernetes
While running a stateful workload natively in Kubernetes can be harder to implement, it provides the greatest operational efficiency and flexibility. You can use either of the two native Kubernetes controllers, StatefulSets or DaemonSets, to run the application. This approach maximizes automation, integration, and workload control while removing the complexity, cost, and time of maintaining a separate database services stack.
Running Workload Outside Kubernetes
The most commonly used approach is to run stateful workloads outside Kubernetes. All you need to do is spin up a new virtual machine (VM) and give resources in the Kubernetes cluster a way to communicate with it. The advantage of this approach is that it lets you run existing stateful applications as they are, without re-architecture or refactoring. However, the additional operational workload, such as process monitoring, in-datacenter load balancing, configuration management, and service discovery, can increase costs.
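One way to let cluster resources reach the VM (addresses below are illustrative) is a selector-less Service paired with a manually managed Endpoints object:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: legacy-db      # no selector: endpoints are managed by hand
spec:
  ports:
    - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: legacy-db      # must match the Service name
subsets:
  - addresses:
      - ip: 10.0.0.42  # illustrative VM address
    ports:
      - port: 5432
```

In-cluster pods connect to `legacy-db` as if it were any other service, while the traffic actually flows to the VM outside the cluster.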
Kubernetes has become a leading orchestration platform for stateful applications. Using Kubernetes constructs like Persistent Volumes, StatefulSets, and DaemonSets, you can efficiently scale and manage stateful applications. In this article, we covered the use cases, best practices, and ways to run stateful applications on Kubernetes. We hope you found it helpful.
If you too want to build, deploy, manage and scale both stateless and stateful applications on Kubernetes, you can hire developers who are well-versed in the application development area with years of experience from Decipher Zone Technologies.