Block Devices and Kubernetes

You may use Ceph Block Device images with Kubernetes v1.13 and later through ceph-csi, which dynamically provisions RBD images to back Kubernetes volumes and maps these RBD images as block devices (optionally mounting a file system contained within the image) on worker nodes running pods that reference an RBD-backed volume. Because Ceph stripes block device images as objects across the cluster, large Ceph Block Device images offer better performance than a standalone server.

To use Ceph Block Devices with Kubernetes v1.13 and higher, you must install and configure ceph-csi within your Kubernetes environment. The following diagram depicts the Kubernetes/Ceph technology stack.

Important

ceph-csi uses the RBD kernel modules by default, which may not support all Ceph CRUSH tunables or RBD image features.
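
If an already-provisioned image uses features that a worker node's kernel RBD client cannot handle, the map operation will fail. One way to work around this, sketched below with a hypothetical image name, is to disable the unsupported features on that image with the rbd CLI; ceph-csi also provides an imageFeatures StorageClass parameter that can limit new images to widely supported features such as layering.

# Hypothetical example: disable features the kernel RBD client cannot
# handle on an existing image (the image name is illustrative only).
$ rbd feature disable kubernetes/csi-vol-example object-map fast-diff deep-flatten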

Create a Pool

By default, Ceph block devices use the rbd pool. Create a pool for Kubernetes volume storage. Ensure your Ceph cluster is running, then create the pool:

$ ceph osd pool create kubernetes

See Create a Pool for details on specifying the number of placement groups for your pools, and see Placement Groups for guidance on how many placement groups to set for your pools.
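
For example, if you prefer to size placement groups explicitly rather than rely on the pg_autoscaler, the pool can be created with an explicit PG count; the value below is illustrative only and should be chosen per the Placement Groups guidance.

# Illustrative only: create the pool with explicit pg_num/pgp_num values,
# then review the autoscaler's view of the pool (if the module is enabled).
$ ceph osd pool create kubernetes 128 128
$ ceph osd pool autoscale-status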

A newly created pool must be initialized prior to use. Use the rbd tool to initialize the pool:

$ rbd pool init kubernetes

Configure ceph-csi

Setup Ceph Client Authentication

Create a new user for Kubernetes and ceph-csi. Execute the following and record the generated key:

$ ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
[client.kubernetes]
    key = AQD9o0Fd6hQRChAAt7fMaSZXduT3NWEqylNpmg==
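
If the key is needed again later, it can be retrieved from the cluster at any time:

$ ceph auth get client.kubernetes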

Generate ceph-csi ConfigMap

ceph-csi requires a ConfigMap object stored in Kubernetes that defines the Ceph monitor addresses for the Ceph cluster. Collect both the Ceph cluster's unique fsid and the monitor addresses:

$ ceph mon dump
<...>
fsid b9127830-b0cc-4e34-aa47-9d1a2e9949a8
<...>
0: [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] mon.a
1: [v2:192.168.1.2:3300/0,v1:192.168.1.2:6789/0] mon.b
2: [v2:192.168.1.3:3300/0,v1:192.168.1.3:6789/0] mon.c

Note

ceph-csi currently only supports the legacy V1 protocol.

Generate a csi-config-map.yaml file similar to the example below, substituting the fsid for “clusterID”, and the monitor addresses for “monitors”:

$ cat <<EOF > csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "b9127830-b0cc-4e34-aa47-9d1a2e9949a8",
        "monitors": [
          "192.168.1.1:6789",
          "192.168.1.2:6789",
          "192.168.1.3:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
EOF

Once generated, store the new ConfigMap object in Kubernetes:

$ kubectl apply -f csi-config-map.yaml
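
As an optional check, the stored ConfigMap can be read back to confirm its contents:

$ kubectl get configmap ceph-csi-config -o yaml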

Generate ceph-csi cephx Secret

ceph-csi requires the cephx credentials for communicating with the Ceph cluster. Generate a csi-rbd-secret.yaml file similar to the example below, using the newly created Kubernetes user ID and cephx key:

$ cat <<EOF > csi-rbd-secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: default
stringData:
  userID: kubernetes
  userKey: AQD9o0Fd6hQRChAAt7fMaSZXduT3NWEqylNpmg==
EOF

Once generated, store the new Secret object in Kubernetes:

$ kubectl apply -f csi-rbd-secret.yaml
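
As an optional check, confirm that the Secret exists (Kubernetes reports the values base64-encoded under data):

$ kubectl get secret csi-rbd-secret -o yaml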

Configure ceph-csi Plugins

Create the required ServiceAccount and RBAC ClusterRole/ClusterRoleBinding Kubernetes objects. These objects do not necessarily need to be customized for your Kubernetes environment and therefore can be used as-is from the ceph-csi deployment YAMLs:

$ kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-provisioner-rbac.yaml
$ kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-nodeplugin-rbac.yaml

Finally, create the ceph-csi provisioner and node plugins. With the possible exception of the ceph-csi container release version, these objects do not necessarily need to be customized for your Kubernetes environment and therefore can be used as-is from the ceph-csi deployment YAMLs:

$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
$ kubectl apply -f csi-rbdplugin-provisioner.yaml
$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin.yaml
$ kubectl apply -f csi-rbdplugin.yaml
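
Before continuing, it can be helpful to verify that the provisioner and node plugin pods are running; a simple, label-agnostic check is:

$ kubectl get pods --all-namespaces | grep rbdplugin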

Important

The provisioner and node plugin YAMLs will, by default, pull the development release of the ceph-csi container (quay.io/cephcsi/cephcsi:canary). The YAMLs should be updated to use a release version container for production workloads.
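
One way to pin a release image is to edit the downloaded YAMLs before applying them, for example with sed as sketched below; the vX.Y.Z tag is a placeholder and must be replaced with a ceph-csi release version you have verified.

# Placeholder tag: substitute a verified ceph-csi release version.
$ sed -i 's|quay.io/cephcsi/cephcsi:canary|quay.io/cephcsi/cephcsi:vX.Y.Z|' \
      csi-rbdplugin-provisioner.yaml csi-rbdplugin.yaml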

Using Ceph Block Devices

Create a StorageClass

The Kubernetes StorageClass defines a class of storage. Multiple StorageClass objects can be created to map to different quality-of-service levels (e.g., NVMe-based vs. HDD-based pools) and features.

For example, to create a ceph-csi StorageClass that maps to the kubernetes pool created above, the following YAML file can be used after ensuring that the “clusterID” property matches your Ceph cluster’s fsid:

$ cat <<EOF > csi-rbd-sc.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: b9127830-b0cc-4e34-aa47-9d1a2e9949a8
  pool: kubernetes
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
mountOptions:
  - discard
EOF
$ kubectl apply -f csi-rbd-sc.yaml
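
The resulting StorageClass can be inspected as an optional check:

$ kubectl get storageclass csi-rbd-sc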

Create a PersistentVolumeClaim

A PersistentVolumeClaim is a request for abstract storage resources by a user. The PersistentVolumeClaim is then associated with a Pod resource to provision a PersistentVolume, which is backed by a Ceph block image. An optional volumeMode can be included to select between a mounted file system (default) or raw block device-based volume.

Using ceph-csi, specifying Filesystem for volumeMode can support both ReadWriteOnce and ReadOnlyMany accessMode claims, and specifying Block for volumeMode can support ReadWriteOnce, ReadWriteMany, and ReadOnlyMany accessMode claims.

For example, to create a block-based PersistentVolumeClaim that utilizes the ceph-csi-based StorageClass created above, the following YAML can be used to request raw block storage from the csi-rbd-sc StorageClass:

$ cat <<EOF > raw-block-pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
EOF
$ kubectl apply -f raw-block-pvc.yaml
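
Once applied, the claim should reach the Bound state, which can be checked with:

$ kubectl get pvc raw-block-pvc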

The following demonstrates an example of binding the above PersistentVolumeClaim to a Pod resource as a raw block device:

$ cat <<EOF > raw-block-pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-raw-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: ["tail -f /dev/null"]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc
EOF
$ kubectl apply -f raw-block-pod.yaml
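
As a quick sanity check, the raw block device should be visible inside the running pod at the configured devicePath:

$ kubectl exec pod-with-raw-block-volume -- ls -l /dev/xvda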

To create a file-system-based PersistentVolumeClaim that utilizes the ceph-csi-based StorageClass created above, the following YAML can be used to request a mounted file system (backed by an RBD image) from the csi-rbd-sc StorageClass:

$ cat <<EOF > pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
EOF
$ kubectl apply -f pvc.yaml
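
As with the raw block claim, this PersistentVolumeClaim should report a Bound status once provisioning completes:

$ kubectl get pvc rbd-pvc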

The following demonstrates an example of binding the above PersistentVolumeClaim to a Pod resource as a mounted file system:

$ cat <<EOF > pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false
EOF
$ kubectl apply -f pod.yaml
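
To confirm the RBD-backed file system is mounted inside the pod, the mount point can be inspected once the pod is running:

$ kubectl exec csi-rbd-demo-pod -- df -h /var/lib/www/html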