In Kubernetes, a volume represents a disk or directory that containers can write data onto or read data from, to handle cluster storage needs. Kubernetes supports two volume types — persistent and ephemeral — for different use cases. While persistent volumes retain data irrespective of a pod’s lifecycle, ephemeral volumes last only for the lifetime of a pod and are deleted as soon as the pod terminates.
In this article, we’ll discuss how Kubernetes handles ephemeral storage and learn how these volumes are provisioned in operating clusters.
How Kubernetes handles ephemeral storage
Ephemeral storage suits immutable applications and handles the transient needs of pods running on cluster nodes. Such applications use storage intermittently but don’t need data to persist across pod restarts. Pods also use ephemeral storage for caching, scratch space, and logs, for sharing nonessential data between the containers of a multi-container pod, and for injecting configuration data into a pod.
In a Kubernetes cluster, you manage ephemeral storage by setting resource requests and limits on pods and, optionally, resource quotas on namespaces, which count all non-terminal pods. Requests and limits are set on each container, and Kubernetes sums them to obtain the pod-level values, which allows fine-grained storage management. Ephemeral storage is unstructured and shared between all pods running on the node, the container runtime, and other processes managed by the system. Pods use these settings to declare their transient local storage needs, and Kubernetes uses them to schedule pods onto suitable nodes and to prevent any pod from consuming too much local node storage.
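As a minimal sketch of the settings described above, the following manifests declare an ephemeral storage request and limit on a container and a quota on a namespace. The object names, the namespace, the busybox image, and all sizes are illustrative assumptions, not values from a real cluster.

```yaml
# Pod whose single container requests 2Gi of local ephemeral storage
# and may use at most 4Gi. All names and sizes are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
  namespace: demo
spec:
  containers:
    - name: app
      image: busybox:1.28
      command: ["sleep", "1000000"]
      resources:
        requests:
          ephemeral-storage: "2Gi"   # used by the scheduler to pick a node
        limits:
          ephemeral-storage: "4Gi"   # exceeding this can get the pod evicted
---
# Namespace-wide cap on the summed requests and limits of non-terminal pods.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota
  namespace: demo
spec:
  hard:
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi
```

The request is what the scheduler compares against a node's allocatable ephemeral storage, while the limit is what the kubelet enforces at runtime.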
Ephemeral storage options
Kubernetes supports two partition layouts for a node’s local ephemeral storage. They are:
The root partition holds the logs (/var/log) and kubelet (/var/lib/kubelet) directories and is shared among Kubernetes system daemons, user pods, and the underlying operating system. In such a setup, pods use emptyDir volumes to consume ephemeral storage in the root primary partition. Pods also utilize the root partition when creating image layers, container-writeable layers, and container logs for transient applications. A root partition is completely ephemeral, so it doesn’t support any performance SLAs, such as disk IOPS.
Runtimes often use an additional partition for overlay file systems. Kubernetes uses this runtime partition to identify and provide both shared access and isolation as required. Pods store container images and container-writeable layers within the runtime partition by default. In instances where both runtime and root partitions exist, the runtime partition is considered the default option for writeable storage.
Types of Kubernetes ephemeral volumes
To support different use cases, Kubernetes allows the provisioning of different types of ephemeral volumes. Some commonly used ephemeral volumes include:
Generic ephemeral volumes are provisioned through storage drivers that support persistent storage on Kubernetes. These volumes can be provided by any storage driver that supports dynamic provisioning, and they give each pod a directory for scratch data that is initially empty. Generic ephemeral volumes can be either local or network-attached and support the typical volume operations implemented by the installed driver, such as cloning, snapshotting, and resizing.
Since version 1.15, Kubernetes has supported CSI drivers for inline ephemeral volumes. These drivers dynamically create volumes and mount them to a pod, so the volumes remain dependent on the pod’s lifecycle. CSI volumes are defined as part of the pod spec and are deleted during pod termination. Examples of CSI drivers for ephemeral inline volumes include:
- PMEM CSI - A persistent memory driver that provides a hybrid of persistent data storage that is faster than normal SSDs, and ephemeral scratch space with a larger storage capacity than DRAM.
- Image populator - A storage driver that automatically unwraps a container image to utilize its content as an ephemeral storage volume.
- Cert-manager-csi - A volume driver that works with the Kubernetes certificate manager to request key pairs and mount them into pods.
ConfigMap, downward API, secret
These volumes are collectively used to inject different types of data into a pod. They are provisioned as local ephemeral storage and are managed by the kubelet service on each node.
- A configMap volume injects configuration data into a pod. The data stored in the ConfigMap that the volume references is mounted into the pod at a path specified within the pod manifest.
- The downwardAPI volume stores data exposed by the Downward API as read-only files in plaintext format.
- A secret volume is used to pass sensitive information, such as passwords, private keys, and authentication tokens, into pods.
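As a minimal sketch of how these three volume types appear in a pod manifest, the example below mounts one of each. The pod, container, and volume names, the mount paths, the busybox image, and the referenced ConfigMap (log-config) and Secret (app-credentials) are illustrative assumptions; both objects would need to exist in the pod's namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: injected-data-pod          # placeholder name
  labels:
    app: demo
spec:
  containers:
    - name: app
      image: busybox:1.28
      command: ["sleep", "1000000"]
      volumeMounts:
        - name: config-vol
          mountPath: /etc/config   # ConfigMap keys appear as files here
        - name: podinfo
          mountPath: /etc/podinfo  # downward API fields appear as files here
        - name: creds
          mountPath: /etc/creds    # Secret keys appear as files here
          readOnly: true
  volumes:
    - name: config-vol
      configMap:
        name: log-config           # assumed existing ConfigMap
    - name: podinfo
      downwardAPI:
        items:
          - path: "labels"         # file that will contain the pod's labels
            fieldRef:
              fieldPath: metadata.labels
    - name: creds
      secret:
        secretName: app-credentials   # assumed existing Secret
```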
An emptyDir volume is created as soon as a pod initializes and is available for as long as the pod stays non-terminal. While all containers running in a pod read and write the same directories within the volume, the volume can be mounted on different paths in multiple containers. Whenever the pod terminates, the data in the emptyDir volume is deleted permanently.
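As a minimal sketch of this behavior, the pod below shares one emptyDir volume between two containers that mount it at different paths. The names, the busybox image, the shell commands, and the 500Mi size limit are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch-pod         # placeholder name
spec:
  containers:
    - name: writer
      image: busybox:1.28
      # Appends a timestamp to a file in the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /cache/out.log; sleep 5; done"]
      volumeMounts:
        - name: cache-volume
          mountPath: /cache
    - name: reader
      image: busybox:1.28
      # Sees the same file under a different mount path.
      command: ["sh", "-c", "touch /data/out.log && tail -f /data/out.log"]
      volumeMounts:
        - name: cache-volume
          mountPath: /data
  volumes:
    - name: cache-volume
      emptyDir:
        sizeLimit: 500Mi           # optional cap on the volume's size
```

Deleting the pod removes the volume and everything both containers wrote to it.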
Provisioning ephemeral volumes in a Kubernetes cluster
Ephemeral volumes are specified within a pod specification, simplifying their deployment and management. Kubernetes supports the provisioning of multiple ephemeral volume types in clusters, depending on the workload and use case. In the following section, we discuss approaches to deploying CSI and generic ephemeral volumes, the two most commonly used ephemeral volumes.
CSI ephemeral volume
CSI ephemeral volumes are supported only by some CSI drivers and rely on the CSIInlineVolume feature gate being active. In Kubernetes versions 1.16 and later, the feature gate is enabled by default. The CSI ephemeral volume is managed within the node’s local storage and is created locally after pod scheduling. The manifest of a pod that uses CSI ephemeral storage would look similar to:
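As a minimal sketch of such a manifest, the following pod mounts one inline CSI volume. The pod, container, and volume names, the busybox image, the driver name inline.storage.kubernetes.io, and the foo: bar attribute are illustrative assumptions; a real driver defines its own name and the attributes it accepts.

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app                 # placeholder name
spec:
  containers:
    - name: my-frontend
      image: busybox:1.28
      command: ["sleep", "1000000"]
      volumeMounts:
        - mountPath: "/data"
          name: my-csi-inline-vol
  volumes:
    - name: my-csi-inline-vol
      csi:
        driver: inline.storage.kubernetes.io   # placeholder driver name
        volumeAttributes:
          foo: bar                 # driver-specific key-value pairs (hypothetical)
```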
The csi spec points the pod to the CSI driver and includes the volume attributes. The key-value declarations under the volumeAttributes spec determine the specification of the volume to be deployed by the CSI driver.
CSI driver limitations
With CSI ephemeral volumes, users pass volume attributes directly to the driver through the volumeAttributes field of the pod spec. A driver that accepts parameters normally restricted to administrators can therefore expose them and allow non-admin users to set them through an inline ephemeral volume. Administrators can prevent a CSI driver from being used as an inline ephemeral volume by removing Ephemeral from the volumeLifecycleModes field in the CSI driver’s CSIDriver spec.
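As a minimal sketch of that restriction, a CSIDriver object that lists only Persistent under volumeLifecycleModes does not allow the driver to be used for inline ephemeral volumes. The driver name below is a placeholder, and a real CSIDriver object usually carries additional fields set by the driver's vendor.

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: inline.storage.kubernetes.io   # placeholder; must match the driver's registered name
spec:
  volumeLifecycleModes:
    - Persistent                   # "Ephemeral" is deliberately not listed
```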
Generic ephemeral volume
The sample manifest for a pod with a generic ephemeral volume would look similar to:
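As a minimal sketch, the following manifest uses the names darwin-pod and darwin-scratch-volume, the ReadWriteOnce access mode, and the 1Gi size from the description in this section. The container name, the busybox image, the /scratch mount path, and the reliance on the cluster's default StorageClass are assumptions made for illustration.

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: darwin-pod
spec:
  containers:
    - name: darwin-container       # assumed container name
      image: busybox:1.28          # assumed image
      command: ["sleep", "1000000"]
      volumeMounts:
        - mountPath: "/scratch"    # assumed absolute form of the scratch path
          name: darwin-scratch-volume
  volumes:
    - name: darwin-scratch-volume
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            # No storageClassName is set, so the default StorageClass is assumed.
            resources:
              requests:
                storage: 1Gi
```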
The above manifest represents a scratch volume named darwin-scratch-volume that is mounted to the scratch mount path of a pod named darwin-pod. The specification also defines the volume to support the ReadWriteOnce access mode and is assigned a 1Gi storage capacity.
PersistentVolumeClaim and volume lifecycle
For generic ephemeral volumes, the volume claim parameters are defined within the pod’s volume source. When the pod is created, the Kubernetes ephemeral volume controller creates a PVC object within the pod’s namespace. Besides ensuring volume binding, the PVC holds the current status of the volume and can be referenced as a data source for volume operations such as snapshotting and cloning.
The ephemeral volume controller makes the pod the owner of the PersistentVolumeClaim, so the Kubernetes garbage collector deletes the claim when the pod is deleted. The labels, annotations, and other fields of the PVC object are copied from the claim template in the pod spec. The names of these PersistentVolumeClaims are deterministic, which makes it easier to find automatically created PVCs and keep them in sync with their pods.
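The automatically created claim is named after the pod and the volume, joined with a hyphen. As an illustrative, abbreviated sketch (not captured from a real cluster), the claim for the darwin-scratch-volume volume of darwin-pod could look roughly like the following; the default namespace is an assumption, and server-populated fields such as uid and status are omitted.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: darwin-pod-darwin-scratch-volume   # <pod name>-<volume name>
  namespace: default                       # assumed; always the pod's namespace
  ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: darwin-pod                     # the owning pod; its uid is filled in by the cluster
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
```

Because the name is predictable, it can collide with a manually created claim of the same name, so pod and volume names are worth choosing with that in mind.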
Monitoring ephemeral storage
Various tools can monitor the capacity and usage of ephemeral volumes. On an active node, this storage usually lives under the kubelet directory (/var/lib/kubelet) or the container runtime’s directory, such as /var/lib/docker. One common approach is to use tools such as /bin/df to check disk usage and other metrics in ephemeral storage directories.
To display storage capacity values in a human-readable format, administrators can combine the df tool with the -h flag. For instance, to check storage statistics for the /var/lib/ directory, administrators can use a command similar to:
df -h /var/lib/
The command reports, for the filesystem that backs the directory, its total size, used and available space, usage percentage, and mount point.
Ephemeral volumes are designed for applications running in pods that don’t need to persist data across restarts. These volumes are useful for transient pod needs such as caching, logging, and scratch space. For generic ephemeral volumes, the lifecycle of the volume is managed through a PVC object, as it is for persistent volumes. With ephemeral volumes, pods can stop and restart gracefully without being restricted by the location of a persistent volume.
While the choice of provisioning persistent or ephemeral volumes allows administrators to configure cluster storage based on workload requirements, continuous monitoring is one of the most critical factors in maintaining a distributed cluster’s performance.
Since monitoring is one of the most important factors, you want to ensure you have the best platform in place to do so effectively. Airplane is a great solution to help you monitor your clusters. You can build an internal dashboard to help you monitor and alert users of errors. You can also build custom workflows to help ensure your clusters are performing properly.