TL;DR
We got rook running IPv6-only and use the helm operator chart together with custom CephCluster and CephBlockPool objects.
v1: original rook manifests
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f crds.yaml -f common.yaml
kubectl apply -f operator.yaml
kubectl get -n rook-ceph pods --watch
kubectl apply -f cluster.yaml
kubectl apply -f csi/rbd/storageclass.yaml
kubectl apply -f toolbox.yaml
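Instead of watching the pod list by eye before applying cluster.yaml, the wait can be scripted. A minimal sketch, assuming the operator pod carries the label app=rook-ceph-operator (as in the upstream operator.yaml) and that a five minute timeout is acceptable. The function name and the KUBECTL override are made up for illustration.

```shell
#!/bin/sh
# Sketch: block until the rook operator pod is Ready.
# Assumptions: label app=rook-ceph-operator, default timeout 300s.
# KUBECTL can be overridden (e.g. KUBECTL="echo kubectl") for a dry run.

wait_for_operator() {
    # $1 is an optional timeout such as 60s
    ${KUBECTL:-kubectl} -n rook-ceph wait pod \
        -l app=rook-ceph-operator \
        --for=condition=Ready \
        --timeout="${1:-300s}"
}
```

Usage could then look like: wait_for_operator && kubectl apply -f cluster.yaml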
v2 with included manifests
- Patched for IPv6 support
- Including RBD support
- Including CephFS support
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
kubectl apply -f ${yaml}.yaml
done
Deleting (in case of teardown):
for yaml in toolbox storageclass-rbd storageclass-cephfs cluster operator common crds; do
kubectl delete -f ${yaml}.yaml
done
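Teardown is usually safer in the reverse order of creation, so that the cluster and storage classes are removed while the operator and the CRDs still exist. A minimal sketch that keeps a single manifest list and derives the delete order from it. The helper names and the KUBECTL override are hypothetical, not part of rook.

```shell
#!/bin/sh
# Sketch: one manifest list, teardown order derived from it.
MANIFESTS="crds common operator cluster storageclass-cephfs storageclass-rbd toolbox"

# Print the given words in reverse order, space separated.
reverse_words() {
    reversed=""
    for word in "$@"; do
        reversed="$word${reversed:+ $reversed}"
    done
    printf '%s\n' "$reversed"
}

# Delete in reverse order of creation.
# --ignore-not-found makes a second run harmless.
teardown() {
    for yaml in $(reverse_words $MANIFESTS); do
        ${KUBECTL:-kubectl} delete --ignore-not-found -f "${yaml}.yaml"
    done
}
```

Running it with KUBECTL="echo kubectl" teardown prints the commands without executing them.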
v3 via helm
helm repo add rook-release https://charts.rook.io/release
helm repo update
helm install --create-namespace --namespace rook-ceph rook-ceph rook-release/rook-ceph
helm install --create-namespace --namespace rook-ceph rook-ceph-cluster \
--set operatorNamespace=rook-ceph rook-release/rook-ceph-cluster -f rook/values.yaml
Debugging / ceph toolbox
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
Creating a sample RBD device / PVC
kubectl apply -f pvc.yaml
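The pvc.yaml itself is not shown in these notes. A minimal sketch of what it could contain, assuming the RBD storage class is named rook-ceph-block. The claim name rbd-pvc-test and the size of 1Gi are placeholders.

```shell
#!/bin/sh
# Sketch: write a minimal RBD-backed claim to pvc.yaml.
# Assumptions: storage class rook-ceph-block; name and size are placeholders.
cat > pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 1Gi
EOF
```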
Checks:
kubectl get pvc
kubectl describe pvc
kubectl get pv
kubectl describe pv
Digging into ceph, seeing the actual image:
[20:05] server47.place7:~# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- rbd -p replicapool ls
csi-vol-d3c96f79-c7ba-11eb-8e52-1ed2f2d63451
[20:11] server47.place7:~#
Filesystem
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r 1/1 Running 0 16s
rook-ceph-mds-myfs-b-dd78b444b-49h5h 0/1 PodInitializing 0 14s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r 1/1 Running 0 20s
rook-ceph-mds-myfs-b-dd78b444b-49h5h 1/1 Running 0 18s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
cluster:
id: 049110d9-9368-4750-b3d3-6ca9a80553d7
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
services:
mon: 3 daemons, quorum a,b,d (age 98m)
mgr: a(active, since 97m), standbys: b
mds: 1/1 daemons up, 1 hot standby
osd: 6 osds: 6 up (since 66m), 6 in (since 67m)
data:
volumes: 1/1 healthy
pools: 4 pools, 97 pgs
objects: 31 objects, 27 KiB
usage: 40 MiB used, 45 GiB / 45 GiB avail
pgs: 97 active+clean
io:
client: 3.3 KiB/s rd, 2.8 KiB/s wr, 2 op/s rd, 1 op/s wr
[21:07] server47.place7:~/ungleich-k8s/rook#
DefaultStorageClass
By default, none of the created storage classes is marked as the "default" of the cluster. One of them needs to be set as default so that PersistentVolumeClaims without an explicit storageClassName get provisioned:
[21:22] server47.place7:~/ungleich-k8s/rook# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
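To avoid ambiguity, only one storage class should carry the default annotation at a time, so switching the default also means setting it to "false" on the old class. A minimal sketch that builds the patch for either value. The function names and the KUBECTL override are hypothetical.

```shell
#!/bin/sh
# Sketch: build the is-default-class annotation patch for either value.
default_class_patch() {
    # $1 is "true" or "false"
    printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}\n' "$1"
}

# Mark a storage class as the cluster default.
set_default_class() {
    ${KUBECTL:-kubectl} patch storageclass "$1" -p "$(default_class_patch true)"
}

# Clear the default flag on a storage class.
unset_default_class() {
    ${KUBECTL:-kubectl} patch storageclass "$1" -p "$(default_class_patch false)"
}
```

Afterwards, kubectl get storageclass marks the default class with "(default)" next to its name.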
Deleting in case rook gets stuck
- The list needs to be gone through manually; patching the finalizers does not work reliably
Especially these:
finalizers:
- cephblockpool.ceph.rook.io
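Since automated patching has proven unreliable here, the following sketch only prints one finalizer-clearing command per resource, so the list can be reviewed and worked through by hand. It assumes the resources live in the rook-ceph namespace. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print (not run) one finalizer-clearing command per rook resource.
# Input on stdin: lines of "type/name", for example from
#   kubectl -n rook-ceph get cephblockpool -o name
print_finalizer_patches() {
    while IFS= read -r resource; do
        # Skip empty lines.
        [ -n "$resource" ] || continue
        printf "kubectl -n rook-ceph patch %s --type merge -p '%s'\n" \
            "$resource" '{"metadata":{"finalizers":[]}}'
    done
}
```

Usage could look like: kubectl -n rook-ceph get cephblockpool -o name | print_finalizer_patches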
Other flux-related problems
- The host is not cleared / old /var/lib/rook is persisting
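Leftover state in /var/lib/rook can be detected on each host before a reinstall. A minimal sketch, assuming the default data directory /var/lib/rook. The function name is hypothetical, and the sketch deliberately only reports and does not delete anything.

```shell
#!/bin/sh
# Sketch: report leftover rook state on a host before reinstalling.
# Succeeds (exit 0) if the directory exists and is non-empty.
has_rook_leftovers() {
    dir="${1:-/var/lib/rook}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

Usage could look like: has_rook_leftovers && echo "old state found in /var/lib/rook"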
Troubleshooting: PVC stuck pending, no csi-{cephfs,rbd}plugin-provisioner pod in rook-ceph namespace
2021-07-31: It seems that the provisioner plugins tend to die silently. Restarting the rook-ceph-operator deployment brings them back up:
kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph
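The check and the restart can be combined so that the operator is only restarted when a provisioner is actually gone. A minimal sketch, assuming the provisioner pods are labelled app=csi-rbdplugin-provisioner and app=csi-cephfsplugin-provisioner. The function names and the KUBECTL override are made up for illustration.

```shell
#!/bin/sh
# Sketch: restart the operator only if a CSI provisioner has no pods.
# Assumption: provisioner pods are labelled app=csi-<driver>plugin-provisioner.
# Note: a failing kubectl call also yields an empty list and counts as missing.

provisioner_missing() {
    # $1 is "rbd" or "cephfs"; succeeds when no matching pod is listed.
    pods=$(${KUBECTL:-kubectl} -n rook-ceph get pods \
        -l "app=csi-${1}plugin-provisioner" -o name)
    [ -z "$pods" ]
}

restart_operator_if_needed() {
    for driver in rbd cephfs; do
        if provisioner_missing "$driver"; then
            ${KUBECTL:-kubectl} -n rook-ceph rollout restart deployment/rook-ceph-operator
            return 0
        fi
    done
    echo "provisioner pods present, nothing to do"
}
```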