By Abhijith Shenoy
The container ecosystem has evolved immensely over the last few years to support production-ready workloads. That evolution is still under way, and it has produced several container platforms for implementing modern applications and services with a focus on solving developer problems. While each platform is built on self-service, API-driven, programmable infrastructure, Kubernetes has emerged as the de facto standard for container orchestration, based purely on its technical pedigree.
As organizations embark on the journey to containerization, it is necessary to recognize the importance of data persistence for enterprise applications. Kubernetes excels in this respect: it can provide persistent storage to workloads whether they are scheduled on-premises or in the cloud. This ability comes from the Persistent Volume framework of Kubernetes, which standardizes how persistent storage is dynamically provisioned and consumed by application pods.
Hedvig has complete integration with the Kubernetes Persistent Volume framework. The Hedvig Dynamic Provisioner, which is an out-of-tree provisioner, allows Kubernetes users to dynamically provision Hedvig virtual disks and consume them as persistent volumes using native Kubernetes constructs. Check out the detailed walk-through of our technical architecture and a demo of our Kubernetes integration from our most recent Tech Field Day presentation.
Although the Persistent Volume framework had a positive impact on application development (it provides a programmable interface for developing stateful applications and shortens test/dev-to-production release times), it had a negative impact on persistent storage providers. Volume plugins in Kubernetes are in-tree plugins, which means that their code is packaged and shipped with the Kubernetes binaries. Therefore, any enhancement to an in-tree plugin by a storage provider, even a bug fix or a cool new feature, had to align with Kubernetes release timelines. In addition, it was impractical for the Kubernetes developer community to test and certify every third-party plugin. Enter CSI!
Container Storage Interface (CSI)
CSI is a community-driven project with the main goal of standardizing persistent volume workflows across different container orchestrators (CO) such as Kubernetes. With CSI, storage providers (SP) can develop, maintain and deploy plugins across different container orchestrators with no dependency on the orchestrator core code. This leads to better turnaround times for bug fixes and new features.
Learn more in the CSI specification, which explains the interactions between container orchestrators and storage providers.
In a nutshell, a CSI driver consists of the following components:
- Node Server: This is a gRPC server that enables access to persistent volumes on the node where it runs. If you have deployed a Kubernetes cluster with three worker nodes, the node server should be running on each of these three nodes, since stateful applications can be scheduled on any of them.
- Controller Server: This is a gRPC server that manages the lifecycle (creation/deletion, among other operations) of persistent volumes. Because these operations are not tied to the node where a volume is consumed, it does not need to run on every node.
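To make the split between the two servers concrete, here is a minimal Python sketch that groups a subset of the RPC names defined in the CSI v1.0.0 specification by the gRPC service that implements them. The dictionary layout and the `service_for` helper are illustrative assumptions, not part of any CSI library, and the RPC lists are deliberately partial.

```python
# Sketch only: a partial map of CSI v1.0.0 RPC names to the gRPC service
# that exposes them. The structure and helper are hypothetical.
CSI_SERVICES = {
    "Controller": [
        "CreateVolume",
        "DeleteVolume",
        "ControllerPublishVolume",
        "ControllerUnpublishVolume",
        "CreateSnapshot",
        "DeleteSnapshot",
    ],
    "Node": [
        "NodeStageVolume",
        "NodeUnstageVolume",
        "NodePublishVolume",
        "NodeUnpublishVolume",
        "NodeGetInfo",
    ],
}


def service_for(rpc_name):
    """Return the name of the service that exposes the given RPC."""
    for service, rpcs in CSI_SERVICES.items():
        if rpc_name in rpcs:
            return service
    raise KeyError(f"unknown RPC in this sketch: {rpc_name}")


if __name__ == "__main__":
    for rpc in ("CreateVolume", "NodePublishVolume"):
        print(rpc, "->", service_for(rpc))
```

Lifecycle calls such as `CreateVolume` land on the Controller service, while calls that touch a node's filesystem, such as `NodePublishVolume`, land on the Node service. That is why only the latter has to run on every node.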
In the following section, we will describe how these components are deployed for Kubernetes and how they interact with each other to seamlessly create stateful applications.
Hedvig Dynamic Provisioner and CSI Driver
The Hedvig-CSI Driver supports v1.0.0 of the CSI specification. The following figure provides an overview of how Hedvig integrates with any Kubernetes cluster through the CSI driver.
- The Controller Server is installed as a Deployment and is responsible for dynamically provisioning CSI volumes. It is also responsible for other operations, such as attaching and snapshotting volumes, which need not be executed on the node where the volume is consumed.
- The Node Server is installed as a DaemonSet as a part of the Dynamic Provisioner and is responsible for mounting and unmounting CSI volumes on Kubernetes nodes where the volumes will be consumed by applications.
- The Hedvig Storage Proxy is deployed as a DaemonSet and is responsible for handling I/O requests for all CSI volumes attached locally.
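The practical difference between a Deployment and a DaemonSet in this layout can be sketched in a few lines of Python. The function below is a toy placement model, not the Kubernetes scheduler: the component names, the single controller replica, and the choice of the first node for the controller are all assumptions made for illustration.

```python
def plan_placement(worker_nodes, controller_replicas=1):
    """Toy model of where each component runs.

    DaemonSet-style components get one instance per worker node.
    The Deployment-style controller gets a fixed number of replicas;
    picking the first nodes is a simplification, since the real
    scheduler may choose any eligible node.
    """
    nodes = list(worker_nodes)
    if not nodes:
        raise ValueError("at least one worker node is required")
    return {
        "controller-server": nodes[:controller_replicas],  # Deployment
        "node-server": nodes,                              # DaemonSet
        "storage-proxy": nodes,                            # DaemonSet
    }


if __name__ == "__main__":
    plan = plan_placement(["node-1", "node-2", "node-3"])
    for component, nodes in plan.items():
        print(f"{component}: {len(nodes)} instance(s) on {nodes}")
```

On a three-node cluster this yields one controller instance and three instances each of the node server and the storage proxy, which matches the roles described above.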
The following sequence of events occurs when a Kubernetes user issues a request to provision storage using the CSI driver. These events explain how Hedvig components interact with Kubernetes and utilize the Kubernetes constructs to let end users seamlessly manage Hedvig storage within a Kubernetes cluster.
- The administrator creates one or more storage classes (StorageClass) for Hedvig.
- The user creates a PersistentVolumeClaim by specifying the StorageClass to use and the size of the PersistentVolume requested.
- The Controller Server provisions a Hedvig virtual disk on the underlying Hedvig Storage Cluster with the size requested and the attributes specified in the StorageClass.
- In response to the newly provisioned Hedvig virtual disk, a new PersistentVolume is created in Kubernetes. Kubernetes then binds the PersistentVolumeClaim to the PersistentVolume created.
- The Controller Server presents the Hedvig virtual disk as a LUN to the Hedvig Storage Proxy on the Kubernetes node where the application is scheduled.
- The Node Server (running on the node where the application is scheduled) mounts the persistent volume, which is then consumed by the application.
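The sequence above can be walked through with a small in-memory simulation. This is a sketch under stated assumptions: the `Cluster` class, the function names, and the `vdisk-`/`pv-` naming scheme are hypothetical and do not reflect how the Hedvig driver or Kubernetes actually store state. It only mirrors the order of the six steps.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for Kubernetes plus the storage backend."""
    storage_classes: dict = field(default_factory=dict)
    virtual_disks: dict = field(default_factory=dict)
    volumes: dict = field(default_factory=dict)
    claims: dict = field(default_factory=dict)
    attachments: dict = field(default_factory=dict)
    mounts: dict = field(default_factory=dict)


def create_storage_class(cluster, name, parameters):
    # Step 1: the administrator defines a StorageClass.
    cluster.storage_classes[name] = dict(parameters)


def create_claim(cluster, claim_name, storage_class, size_gib):
    # Step 2: the user requests storage by class and size.
    attributes = cluster.storage_classes[storage_class]
    # Step 3: the controller server provisions a virtual disk.
    disk = f"vdisk-{claim_name}"
    cluster.virtual_disks[disk] = {"size_gib": size_gib, **attributes}
    # Step 4: a PersistentVolume is created and bound to the claim.
    pv = f"pv-{claim_name}"
    cluster.volumes[pv] = {"disk": disk, "claim": claim_name}
    cluster.claims[claim_name] = {"status": "Bound", "volume": pv}
    return pv


def schedule_pod(cluster, claim_name, node, mount_path):
    pv = cluster.claims[claim_name]["volume"]
    disk = cluster.volumes[pv]["disk"]
    # Step 5: the disk is presented to the storage proxy on that node.
    cluster.attachments[disk] = node
    # Step 6: the node server mounts the volume for the application.
    cluster.mounts[(node, mount_path)] = pv


if __name__ == "__main__":
    c = Cluster()
    create_storage_class(c, "sc-example", {"compression": "true"})
    create_claim(c, "pvc-example", "sc-example", size_gib=10)
    schedule_pod(c, "pvc-example", "node-2", "/data")
    print(c.claims, c.attachments, c.mounts, sep="\n")
```

Note that steps 3 and 4 happen before any pod exists, while steps 5 and 6 depend on the node chosen by the scheduler. This mirrors the controller/node split of the driver.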
A default StorageClass for Hedvig-CSI can be created using the following specification.
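As a rough illustration of the shape of such an object, the Python sketch below builds a StorageClass manifest as a dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The `apiVersion`, `kind`, `provisioner`, and `parameters` fields are standard Kubernetes; the class name, the provisioner string, and the parameter keys are placeholders and should be taken from the driver's documentation.

```python
import json


def storage_class_manifest(name, provisioner, parameters):
    """Build a StorageClass manifest as a plain dictionary."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # StorageClass parameter values must be strings.
        "parameters": {key: str(value) for key, value in parameters.items()},
    }


# All three values below are assumptions for illustration only.
manifest = storage_class_manifest(
    name="sc-hedvig-csi-example",                # hypothetical class name
    provisioner="<hedvig-csi-driver-name>",      # placeholder, not the real name
    parameters={"compression": "true",           # hypothetical parameter keys
                "deduplication": "true"},
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```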
Any persistent volume created using this storage class is backed by a Hedvig virtual disk with compression and deduplication enabled. To provision a persistent volume with this storage class, create a persistent volume claim using the following specification.
To consume the persistent volume, create an application pod that references this persistent volume claim. The following specification creates an Nginx application pod and mounts the persistent volume claim under "/data" within the application container.
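A claim of this kind could look roughly like the manifest produced by the Python sketch below. The field names follow the core Kubernetes PersistentVolumeClaim API; the claim name, storage class name, access mode, and requested size are assumptions chosen for illustration.

```python
import json


def pvc_manifest(name, storage_class, size, access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size; adjust to the storage class actually created.
claim = pvc_manifest(
    name="pvc-hedvig-csi-example",
    storage_class="sc-hedvig-csi-example",
    size="10Gi",
)

if __name__ == "__main__":
    print(json.dumps(claim, indent=2))
```

Kubernetes matches the claim to a provisioner through `storageClassName`, so that value has to equal the name of an existing StorageClass.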
- name: hedvig-csi-pv
- name: ctr-nginx
- mountPath: "/data"
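For orientation, the Python sketch below assembles a pod manifest that uses the three values visible in the specification (the volume name `hedvig-csi-pv`, the container name `ctr-nginx`, and the mount path `/data`). Everything else, including the pod name, the claim name, and the `nginx` image tag, is an assumption and may differ from the original specification.

```python
import json


def nginx_pod_manifest(pod_name, claim_name,
                       volume_name="hedvig-csi-pv",
                       container_name="ctr-nginx",
                       mount_path="/data"):
    """Build a Pod manifest that mounts a PersistentVolumeClaim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": container_name,
                "image": "nginx",  # assumed image
                "volumeMounts": [{"name": volume_name,
                                  "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": volume_name,
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


# Pod and claim names are hypothetical.
pod = nginx_pod_manifest("pod-nginx-example", "pvc-hedvig-csi-example")

if __name__ == "__main__":
    print(json.dumps(pod, indent=2))
```

The volume name ties the two halves together: the entry under `volumes` points at the claim, and the entry under `volumeMounts` with the same name decides where that volume appears inside the container.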
The Hedvig CSI driver can be found on Docker Hub.
For more discussion of containers and Kubernetes, check out the on-demand recording of our webinar on providing application container availability across clouds. The webinar gives a high-level explanation of Kubernetes and demonstrates the distributed nature of our software-defined storage platform running our CSI driver.