Storage — PV, PVC, CSI Drivers

In the enterprise bank architecture, the platform team defines approved StorageClasses with encryption, performance tiers, and backup policies. Tenant teams create PVCs referencing these StorageClasses. The CSI drivers (deployed as DaemonSets + controller Deployments) handle the actual volume provisioning in AWS or GCP.

Storage Architecture in the Banking Platform

Persistent Volumes (PV), Persistent Volume Claims (PVC), StorageClasses

StorageClass, PersistentVolume, and PersistentVolumeClaim

Dynamic Provisioning (the standard approach)

This is how 99% of enterprise storage works in Kubernetes:

Dynamic Provisioning Flow

Static Provisioning (the exception)

An admin manually creates a PV pointing to an existing volume, then creates a PVC that binds to it. Used for:

  • Pre-existing volumes with data (migration scenarios)
  • Volumes that must not be auto-deleted
  • Cross-account volume sharing (rare)
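A statically provisioned pairing is a hand-written PV plus a PVC that names it explicitly. A minimal sketch, assuming an existing EBS volume; the volume ID, names, and sizes are placeholders:

```yaml
# Static PV pointing at a pre-existing EBS volume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: migrated-data-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # never auto-delete migrated data
  storageClassName: ""                   # empty class disables dynamic provisioning
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0abc12345example   # existing EBS volume ID (placeholder)
    fsType: ext4
---
# PVC that binds explicitly to that PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: migrated-data
  namespace: payments
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ""
  volumeName: migrated-data-pv           # bind to the specific PV above
  resources:
    requests:
      storage: 100Gi
```

Setting `volumeName` skips the normal matching logic, so the PVC binds to exactly this PV and nothing else.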
Access modes:

| Mode | Short | Description | EBS/PD | EFS/Filestore |
|---|---|---|---|---|
| ReadWriteOnce | RWO | Single node read/write | Yes | Yes |
| ReadOnlyMany | ROX | Multiple nodes read-only | Via snapshot | Yes |
| ReadWriteMany | RWX | Multiple nodes read/write | No | Yes |
| ReadWriteOncePod | RWOP | Single pod read/write (K8s 1.27+) | Yes | Yes |
Volume binding modes:

| Mode | Behavior | When to Use |
|---|---|---|
| Immediate | PV created as soon as PVC is created | When AZ does not matter (rare) |
| WaitForFirstConsumer | PV created when a pod using the PVC is scheduled | Always use this for block storage |
# WHY WaitForFirstConsumer matters:
#
# Scenario: 3-AZ cluster, PVC with Immediate binding
# PVC created → PV created in AZ-a (random)
# Pod scheduled to AZ-b (best fit for resources)
# PROBLEM: EBS volume in AZ-a, pod in AZ-b → cannot attach!
#
# Solution: WaitForFirstConsumer
# PVC created → stays Pending
# Pod scheduled to AZ-b
# PV created in AZ-b (same AZ as pod) → attaches successfully
Reclaim policies:

| Policy | What Happens When PVC is Deleted | Use Case |
|---|---|---|
| Delete (default for dynamic) | PV and underlying cloud volume are deleted | Dev/test, ephemeral data |
| Retain | PV becomes “Released”, cloud volume kept | Production databases, audit data |

For banking, use Retain for all production data. Even after a PVC is deleted, the underlying EBS/PD volume remains for recovery.


AWS EBS CSI Driver

The primary block storage driver for EKS. Deployed as an EKS managed add-on.

EBS volume types for Kubernetes:

| Type | IOPS | Throughput | Use Case | Cost |
|---|---|---|---|---|
| gp3 | 3,000 (free) up to 16,000 | 125 MiB/s (free) up to 1,000 MiB/s | General purpose, most workloads | Lowest |
| io2 | Up to 64,000 (provisioned) | Up to 1,000 MiB/s | Databases (PostgreSQL, MongoDB) | Highest |
| io2 Block Express | Up to 256,000 | Up to 4,000 MiB/s | Extreme IOPS (SAP HANA) | Very high |
| st1 | Throughput-optimized (no provisioned IOPS) | Up to 500 MiB/s | Log processing, data warehousing | Low |

AWS EFS CSI Driver

For shared storage (ReadWriteMany). Essential when multiple pods across nodes need to read/write the same files.

When to use EBS vs EFS:

| Dimension | EBS (block) | EFS (file) |
|---|---|---|
| Access mode | RWO / RWOP | RWX / ROX / RWO |
| Performance | High IOPS, low latency | Lower IOPS, higher latency |
| AZ scope | Single AZ | Multi-AZ |
| Use case | Databases, Kafka, single-pod workloads | Shared config, CMS uploads, ML training data |
| Cost | Per GB provisioned | Per GB used (+ throughput) |
| Backup | EBS Snapshots | EFS Backup (AWS Backup) |
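A sketch of an EFS-backed StorageClass using the EFS CSI driver's access-point provisioning mode; the file system ID, names, and namespace are placeholders:

```yaml
# EFS StorageClass using access points for per-tenant isolation
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-shared
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap           # dynamically create an EFS Access Point per PVC
  fileSystemId: fs-0123456789abcdef0 # existing EFS file system (placeholder)
  directoryPerms: "700"              # each access point gets its own directory
reclaimPolicy: Retain
---
# RWX PVC usable by many pods across nodes and AZs
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads
  namespace: payments
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: efs-shared
  resources:
    requests:
      storage: 5Gi   # EFS is elastic and ignores the size, but the field is required
```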

GCP Persistent Disk CSI Driver

The primary block storage driver for GKE. Built into GKE (no manual installation needed).

GCP Persistent Disk types:

| Type | IOPS (read) | IOPS (write) | Throughput | Use Case | Cost |
|---|---|---|---|---|---|
| pd-standard | 0.75/GiB | 1.5/GiB | 120 MiB/s | Dev/test, logs | Lowest |
| pd-balanced | 6/GiB | 6/GiB | 240 MiB/s | General purpose | Medium |
| pd-ssd | 30/GiB | 30/GiB | 480 MiB/s | Databases, Kafka | Higher |
| pd-extreme | Up to 120K | Up to 120K | Up to 2,400 MiB/s | SAP HANA, Oracle | Highest |
| hyperdisk-extreme | Up to 350K | Up to 350K | Up to 5,000 MiB/s | Extreme performance | Very high |

Regional Persistent Disks:

GCP offers regional PDs that replicate data across two zones. This is a major differentiator from AWS EBS (which is single-AZ only).
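A regional PD StorageClass is one parameter away from the zonal default. A sketch (the class name is a placeholder):

```yaml
# Regional pd-ssd: synchronously replicated across two zones
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-ssd-regional
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd   # default is "none" (zonal)
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```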

GCP Filestore CSI Driver

GCP’s managed NFS service. Equivalent to AWS EFS.

Filestore tiers:

| Tier | Min Size | Performance | Use Case |
|---|---|---|---|
| Basic HDD | 1 TiB | Low IOPS | Archival, low-access shared data |
| Basic SSD | 2.5 TiB | High IOPS | General shared storage |
| Zonal | 1 TiB | Configurable IOPS/throughput | Flexible, new tier |
| Enterprise | 1 TiB | Highest IOPS, regional replication | Mission-critical shared data |
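A Filestore-backed StorageClass might look like the sketch below; the `tier` and `network` values are assumptions and depend on the Filestore CSI driver version and your VPC setup:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-shared
provisioner: filestore.csi.storage.gke.io
parameters:
  tier: standard   # Basic HDD tier (assumed value; check the driver docs)
  network: default # VPC network the Filestore instance attaches to
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```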

GCS FUSE CSI Driver

Mount Google Cloud Storage buckets as file systems in pods. Useful for large datasets (ML training, data analytics).
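GCS FUSE is typically consumed as a CSI ephemeral (inline) volume rather than through a PVC. A sketch; the bucket name, image, and service account are placeholders, and the Kubernetes service account is assumed to have Workload Identity access to the bucket:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-trainer
  annotations:
    gke-gcsfuse/volumes: "true"   # enables the GCS FUSE sidecar injection
spec:
  serviceAccountName: ml-training # bound to a GCP SA via Workload Identity (placeholder)
  containers:
  - name: trainer
    image: trainer:latest          # placeholder image
    volumeMounts:
    - name: training-data
      mountPath: /data
      readOnly: true
  volumes:
  - name: training-data
    csi:
      driver: gcsfuse.csi.storage.gke.io
      readOnly: true
      volumeAttributes:
        bucketName: training-data-bucket # placeholder bucket
```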

# StorageClass — gp3 with encryption (standard for banking)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  kmsKeyId: arn:aws:kms:me-south-1:123456789012:key/mrk-abc123 # CMEK
  fsType: ext4
  iops: "3000"      # gp3 baseline (free up to 3000)
  throughput: "125" # gp3 baseline (free up to 125 MiB/s)
reclaimPolicy: Retain                   # keep volume after PVC deletion
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true              # allow PVC resize
---
# StorageClass — io2 for high-IOPS databases
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: io2-database
provisioner: ebs.csi.aws.com
parameters:
  type: io2
  encrypted: "true"
  kmsKeyId: arn:aws:kms:me-south-1:123456789012:key/mrk-abc123
  fsType: ext4
  iops: "10000"      # provisioned IOPS
  # iopsPerGB: "50"  # alternative: scale IOPS with volume size (set one or the other, not both)
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
# EBS CSI Driver as EKS managed add-on
resource "aws_eks_addon" "ebs_csi" {
  cluster_name                = module.eks.cluster_name
  addon_name                  = "aws-ebs-csi-driver"
  addon_version               = "v1.37.0-eksbuild.1"
  service_account_role_arn    = module.ebs_csi_irsa.iam_role_arn
  resolve_conflicts_on_update = "OVERWRITE"
}

# IAM role for the EBS CSI driver (IRSA; Pod Identity is an alternative)
module "ebs_csi_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name             = "ebs-csi-driver-${module.eks.cluster_name}"
  attach_ebs_csi_policy = true

  # Allow encryption with the custom KMS key
  ebs_csi_kms_cmk_ids = [aws_kms_key.ebs_encryption.arn]

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
    }
  }
}

# KMS key for EBS encryption
resource "aws_kms_key" "ebs_encryption" {
  description             = "KMS key for EBS volume encryption in EKS"
  deletion_window_in_days = 30
  enable_key_rotation     = true
  policy                  = data.aws_iam_policy_document.ebs_kms.json
}

Encryption at Rest — Enterprise Requirement

In banking, ALL persistent volumes must be encrypted. No exceptions.

EBS Encryption Architecture with KMS

Best practice: account-level default encryption. This ensures that even if someone creates a StorageClass without encrypted: "true", the volume is still encrypted.

Persistent Disk Encryption with CMEK (GCP)

GCP PD Encryption Architecture

# Force ALL EBS volumes in the account to be encrypted
resource "aws_ebs_encryption_by_default" "enabled" {
  enabled = true
}

resource "aws_ebs_default_kms_key" "default" {
  key_arn = aws_kms_key.ebs_encryption.arn
}
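On GKE, the equivalent of the encrypted EBS StorageClass uses the `disk-encryption-kms-key` parameter. A sketch; the project, key ring, and key names are placeholders:

```yaml
# StorageClass — pd-ssd with CMEK (GKE)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-ssd-encrypted
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  # full resource path of the Cloud KMS key (placeholder values)
  disk-encryption-kms-key: projects/my-project/locations/me-central1/keyRings/pd-ring/cryptoKeys/pd-key
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```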

Volume Snapshots

Volume snapshots allow point-in-time backups of persistent volumes, stored as cloud snapshots (EBS Snapshots / PD Snapshots).

# AWS — VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Retain # keep snapshot even if VolumeSnapshot object deleted
parameters:
  encrypted: "true"
  kmsKeyId: arn:aws:kms:me-south-1:123456789012:key/mrk-abc123
---
# GCP — VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: pd-snapshot-class
driver: pd.csi.storage.gke.io
deletionPolicy: Retain
parameters:
  storage-locations: me-central1
# Take a snapshot of a PVC
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: kafka-data-snapshot-2026-03-15
  namespace: fraud-detection
spec:
  volumeSnapshotClassName: ebs-snapshot-class # or pd-snapshot-class
  source:
    persistentVolumeClaimName: data-kafka-0 # PVC to snapshot
# Create a new PVC from a snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-kafka-0-restored
  namespace: fraud-detection
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 100Gi
  dataSource:
    name: kafka-data-snapshot-2026-03-15
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
# CronJob to snapshot Kafka data nightly
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kafka-snapshot-backup
  namespace: fraud-detection
spec:
  schedule: "0 3 * * *" # 03:00 daily, interpreted in the timeZone below
  timeZone: "Asia/Dubai"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-manager
          restartPolicy: Never
          containers:
          - name: snapshot
            image: bitnami/kubectl:1.31
            command:
            - /bin/sh
            - -c
            - |
              DATE=$(date +%Y-%m-%d)
              for i in 0 1 2; do
                cat <<SNAP | kubectl apply -f -
              apiVersion: snapshot.storage.k8s.io/v1
              kind: VolumeSnapshot
              metadata:
                name: kafka-data-${i}-${DATE}
                namespace: fraud-detection
              spec:
                volumeSnapshotClassName: ebs-snapshot-class
                source:
                  persistentVolumeClaimName: data-kafka-${i}
              SNAP
              done
              # Keep the newest 21 snapshots (3 brokers x 7 days); delete the rest
              kubectl get volumesnapshot -n fraud-detection \
                --sort-by=.metadata.creationTimestamp \
                -o name | head -n -21 | xargs -r kubectl delete -n fraud-detection

Volume Cloning

Create a new PVC from an existing PVC (no snapshot needed). Useful for creating test environments from production data.

# Clone a PVC (same StorageClass, same AZ; CSI cloning also requires
# the clone to be in the same namespace as the source PVC)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: payments-db-clone
  namespace: payments # must match the source PVC's namespace
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 100Gi
  dataSource:
    name: payments-db-data # source PVC
    kind: PersistentVolumeClaim

Volume Expansion

Grow a PVC without downtime (for file systems that support online resize).

# StorageClass must have allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
allowVolumeExpansion: true # THIS enables resize
# ...
# 1. Edit the PVC to increase size
kubectl patch pvc data-kafka-0 -n fraud-detection \
-p '{"spec": {"resources": {"requests": {"storage": "200Gi"}}}}'
# 2. Check PVC conditions
kubectl get pvc data-kafka-0 -n fraud-detection -o yaml
# Look for condition: FileSystemResizePending
# 3. Volume is resized in cloud (EBS ModifyVolume / PD resize)
# File system resize happens automatically when pod restarts
# (or immediately with online resize support)
# 4. Verify
kubectl get pvc data-kafka-0 -n fraud-detection
# CAPACITY should show 200Gi

Resize flow:

PVC Resize Flow


Ephemeral Volumes

Volumes that live and die with the pod. No PVC needed.

| Type | Description | Use Case |
|---|---|---|
| emptyDir | Empty directory created when pod starts, deleted when pod dies | Scratch space, inter-container data sharing |
| emptyDir (memory) | tmpfs-backed emptyDir | Sensitive data that must not touch disk |
| configMap | Mount ConfigMap as files | Application config files |
| secret | Mount Secret as files | TLS certificates, credentials |
| projected | Combine multiple volume sources into one mount | ServiceAccount token + ConfigMap + Secret |
| downwardAPI | Expose pod metadata as files | Pod name, namespace, labels |
spec:
  containers:
  - name: app
    volumeMounts:
    - name: scratch
      mountPath: /tmp/processing
  - name: sidecar
    volumeMounts:
    - name: scratch
      mountPath: /data # same volume, different mount path
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 5Gi # evict pod if exceeded
---
# Memory-backed emptyDir (for secrets/sensitive processing)
volumes:
- name: sensitive-scratch
  emptyDir:
    medium: Memory # tmpfs — never written to disk
    sizeLimit: 256Mi # counts against pod memory limit
spec:
  containers:
  - name: app
    volumeMounts:
    - name: all-config
      mountPath: /etc/app
      readOnly: true
  volumes:
  - name: all-config
    projected:
      sources:
      - configMap:
          name: app-config
          items:
          - key: config.yaml
            path: config.yaml
      - secret:
          name: app-tls
          items:
          - key: tls.crt
            path: tls/cert.pem
          - key: tls.key
            path: tls/key.pem
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600
          audience: vault

Storage Quotas per Namespace

The platform team limits how much storage each tenant namespace can consume:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: payments
spec:
  hard:
    requests.storage: 500Gi # total PVC size
    persistentvolumeclaims: "20" # max number of PVCs
    gp3-encrypted.storageclass.storage.k8s.io/requests.storage: 300Gi # per StorageClass
    io2-database.storageclass.storage.k8s.io/requests.storage: 200Gi

Default StorageClass

Set a default StorageClass so PVCs without an explicit class get the right storage:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
  annotations:
    storageclass.kubernetes.io/is-default-class: "true" # default
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  kmsKeyId: arn:aws:kms:me-south-1:123456789012:key/mrk-abc123
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Storage Topology Awareness

For multi-AZ clusters with block storage, ensure pods and volumes are co-located:

# StatefulSet with topology spread + storage affinity
spec:
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: kafka
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: topology.kubernetes.io/zone
            labelSelector:
              matchLabels:
                app: kafka
# WaitForFirstConsumer ensures the PV is created in the same AZ as the pod
# Pod anti-affinity ensures one pod per AZ
# Result: each Kafka broker in a different AZ, with its PV in the same AZ

Scenario 1: “Your application needs shared storage across multiple pods. Options on EKS vs GKE?”

Answer:

EKS shared storage options:

| Option | Access Mode | Performance | Cost | Use Case |
|---|---|---|---|---|
| EFS | RWX | Moderate (ms latency) | Per-GB used + throughput | Shared config, uploads, CMS |
| FSx for Lustre | RWX | Very high (sub-ms) | Per-GB provisioned | HPC, ML training |
| S3 via Mountpoint | ROX / RWX (append) | High throughput, high latency | Per-GB + requests | Data lake, archives |

“For a banking application needing shared read-write storage across pods, I would use EFS with the EFS CSI driver. It provides ReadWriteMany access across all AZs, automatic scaling, and encryption with KMS. I would use EFS Access Points to isolate different tenants. For ML training data that is read-heavy, I might use S3 with Mountpoint for S3 CSI driver instead — cheaper and higher throughput for large sequential reads.”

GKE shared storage options:

| Option | Access Mode | Performance | Cost | Use Case |
|---|---|---|---|---|
| Filestore | RWX | High (NFS) | Per-GB provisioned | Shared config, uploads |
| Filestore Enterprise | RWX | Highest, regional replication | Premium | Mission-critical shared data |
| GCS FUSE | ROX / RWX (object) | High throughput, eventual consistency | Per-GB used + ops | ML data, archives, static assets |

“On GKE, I would use Filestore for traditional shared file storage. For ML training data, GCS FUSE is more cost-effective and integrates well with BigQuery and Vertex AI. Filestore Enterprise provides regional replication for HA — important for banking workloads that cannot tolerate zone failures.”


Scenario 2: “Design storage for a Kafka cluster on Kubernetes”

Answer:

Kafka Storage Architecture on Kubernetes

Design decisions:

| Decision | EKS Choice | GKE Choice | Why |
|---|---|---|---|
| Volume type | io2 (10,000 IOPS) | pd-ssd (30 IOPS/GiB = 15,000 at 500Gi) | Kafka needs high IOPS |
| Size | 500Gi per broker | 500Gi per broker | 7-day retention, ~50 topics |
| Encryption | KMS CMK | CMEK | Banking requirement |
| Replication | Kafka replication factor=3 | Kafka replication factor=3 | Data redundancy at app level |
| PD replication | N/A (EBS is single-AZ) | Not needed (Kafka handles it) | Do not pay for both |
| Binding mode | WaitForFirstConsumer | WaitForFirstConsumer | Ensure PV in same AZ as pod |
| Reclaim policy | Retain | Retain | Never auto-delete Kafka data |

“I would NOT use regional PDs for Kafka on GKE. Kafka already replicates data across brokers (replication factor=3). Paying for regional PD replication on top of Kafka replication is wasteful. The PD just needs to be fast and reliable within a single zone. If a zone fails, Kafka’s built-in replication handles it.”


Scenario 3: “A PVC is stuck in Pending. How do you debug?”

Answer:

PVC Pending — Debugging Decision Tree

Quick debugging commands:

# Check PVC status and events
kubectl describe pvc <name> -n <namespace>
# Check if StorageClass exists
kubectl get sc
# Check CSI driver pods are running
kubectl get pods -n kube-system -l app=ebs-csi-controller # EKS
kubectl get pods -n kube-system -l app=gke-pd-csi-driver # GKE
# Check CSI driver logs
kubectl logs -n kube-system -l app=ebs-csi-controller -c csi-provisioner
# Check ResourceQuota
kubectl get resourcequota -n <namespace> -o yaml
# Check PV to see if any are available
kubectl get pv --sort-by=.status.phase
# Check node volume attachment count
kubectl get csinodes -o yaml

Scenario 4: “How do you migrate data from one StorageClass to another without downtime?”

Answer:

StorageClass Migration Strategy

“For a single database volume, I would use the snapshot approach — take a VolumeSnapshot, create a new PVC from the snapshot with the new StorageClass, stop the database briefly, update the StatefulSet to reference the new PVC, and start it. Downtime is minimal (just the restart time).

For a distributed system like Kafka, I would use application-level migration — add new brokers with the new StorageClass, let Kafka rebalance partitions to the new brokers, then decommission the old ones. Zero downtime.”


Scenario 5: “Explain the tradeoffs between EBS/PD (block) vs EFS/Filestore (file) for Kubernetes workloads”

Answer:

| Dimension | Block (EBS/PD) | File (EFS/Filestore) |
|---|---|---|
| Access | Single pod (RWO/RWOP) | Multiple pods (RWX) |
| Performance | High IOPS, low latency (<1 ms) | Moderate IOPS, higher latency (2-5 ms) |
| AZ scope | Single AZ (EBS) / optional regional (PD) | Multi-AZ by default |
| Scaling | Fixed size (must resize explicitly) | Auto-scales (EFS) or fixed (Filestore) |
| Cost | Per-GB provisioned (predictable) | Per-GB used (EFS) or provisioned (Filestore) |
| Backup | Volume snapshots (incremental) | AWS Backup / GCP Backup |
| POSIX compliance | Full (a real block device with ext4/xfs) | Full (NFS) |
| Consistency | Strong (single writer) | NFS semantics (close-to-open) |
| Best for | Databases, Kafka, single-pod stateful workloads | Shared config, CMS uploads, ML training data, WordPress |

“Use block storage when you need raw performance and a single pod owns the data — databases, message brokers, caches. Use file storage when multiple pods need to read and write the same data — shared configuration, user-uploaded files, ML training datasets.

For banking, I would use gp3/io2 (EBS) or pd-ssd (PD) for all databases and stateful services, and EFS or Filestore only for shared file storage like document processing pipelines. I would avoid EFS for high-IOPS workloads because the latency is noticeably higher than EBS.”


  1. Always say WaitForFirstConsumer. If you are designing any StorageClass with block storage, mention this binding mode. It prevents the AZ mismatch problem, which is the single most common storage issue on Kubernetes.

  2. Encryption is not optional. For banking interviews, every StorageClass must have encryption with CMEK (customer-managed keys). Know the difference: AWS has encrypted: "true" + kmsKeyId; GCP has disk-encryption-kms-key.

  3. Know the volume limits per node. EBS has a per-instance attachment limit (typically 25-28 volumes). GCP PD supports up to 128 per node. This matters when running many StatefulSets on the same node.

  4. Understand when NOT to use storage replication. Kafka and other distributed systems already replicate at the application level. Adding regional PDs or EBS multi-AZ on top is wasteful. Match the redundancy mechanism to the application architecture.

  5. PVC Pending is the most common storage issue. Know the debugging tree: missing StorageClass, wrong AZ (Immediate binding), quota exceeded, CSI driver permissions, volume attachment limits. Walk through this methodically in interviews.