
TL;DR

We got Rook running IPv6-only, using the Helm operator chart together with custom CephCluster and CephBlockPool objects.

v1: original rook manifests

git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f crds.yaml -f common.yaml
kubectl apply -f operator.yaml
kubectl get -n rook-ceph pods --watch
kubectl apply -f cluster.yaml
kubectl apply -f csi/rbd/storageclass.yaml
kubectl apply -f toolbox.yaml

v2 with included manifests

  • Patched for IPv6 support
  • Including RBD support
  • Including CephFS support
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
    kubectl apply -f ${yaml}.yaml
done
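
The v1 steps watched the pods between applying operator.yaml and cluster.yaml, while the loop above applies everything back to back. A minimal sketch of a variant that pauses until the operator has rolled out, assuming the operator deployment is named rook-ceph-operator in the rook-ceph namespace (the function name and the 300s timeout are illustrative choices, not part of the repo):

```shell
# Apply the manifests in order; after operator.yaml, block until the
# operator deployment reports a finished rollout before continuing.
# Assumption: the deployment is rook-ceph/rook-ceph-operator.
apply_rook_manifests() (
    for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
        kubectl apply -f "${yaml}.yaml" || exit 1
        if [ "$yaml" = "operator" ]; then
            kubectl -n rook-ceph rollout status deployment/rook-ceph-operator --timeout=300s || exit 1
        fi
    done
)
```

Run it from the directory that contains the manifests with `apply_rook_manifests`.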

Deleting (in case of teardown):

for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
    kubectl delete -f ${yaml}.yaml
done
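
The loop above deletes in the same order as the apply loop, so the CRDs go first. Rook's own cleanup instructions remove the cluster resources before the operator and the CRDs, so a possible alternative is to walk the list backwards. This is a sketch under that assumption, not something the repo ships; the function name is hypothetical:

```shell
# Delete the manifests in the reverse of the apply order, so that the
# cluster is removed while the operator is still running and the CRDs
# and namespace (common.yaml) go last.
delete_rook_manifests() (
    reversed=""
    for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
        reversed="${yaml} ${reversed}"
    done
    for yaml in $reversed; do
        # --ignore-not-found keeps a partially torn-down cluster from aborting the loop
        kubectl delete -f "${yaml}.yaml" --ignore-not-found
    done
)
```

If the CephCluster deletion hangs on a finalizer, this loop will block at cluster.yaml.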

v3 via helm

helm repo add rook-release https://charts.rook.io/release
helm repo update
helm install --create-namespace --namespace rook-ceph rook-ceph rook-release/rook-ceph
helm install --create-namespace --namespace rook-ceph rook-ceph-cluster \
   --set operatorNamespace=rook-ceph rook-release/rook-ceph-cluster -f rook/values.yaml
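
The contents of rook/values.yaml are not reproduced in this README. As a rough illustration of what an IPv6-only override for the rook-ceph-cluster chart might look like, the helper below prints a values fragment that sets the CephCluster network family. The fragment is an assumption about the shape of such a file, not a copy of the real one, and the function name is made up for this example:

```shell
# Print a hypothetical values fragment for the rook-ceph-cluster chart.
# Assumption: the chart passes cephClusterSpec through to the CephCluster
# object, whose network section accepts ipFamily and dualStack.
ipv6_values() {
    cat <<'EOF'
cephClusterSpec:
  network:
    ipFamily: "IPv6"
    dualStack: false
EOF
}
```

The output could be written to a file, for example `ipv6_values > values-ipv6.yaml`, and passed to `helm install` with an additional `-f`.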

Debugging / ceph toolbox

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash

Creating a sample RBD device / PVC

kubectl apply -f pvc.yaml
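
The pvc.yaml itself is not shown here. A minimal sketch of what such a claim could look like, assuming the RBD storage class is called rook-ceph-block (the name used in the DefaultStorageClass section); the claim name and size defaults are placeholders:

```shell
# Print a PersistentVolumeClaim manifest for an RBD-backed volume.
# $1 = claim name (placeholder default), $2 = size (placeholder default).
# Assumption: the storage class is named rook-ceph-block.
sample_pvc_manifest() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${1:-rbd-pvc-test}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: ${2:-1Gi}
EOF
}
```

It can be piped straight into kubectl: `sample_pvc_manifest demo 2Gi | kubectl apply -f -`.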

Checks:

kubectl get pvc
kubectl describe pvc

kubectl get pv
kubectl describe pv
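
Instead of re-running the checks by hand, the claim's phase can be polled until it reports Bound. The helper below is a small sketch of that idea; the function name, the retry count and the POLL_INTERVAL variable are illustrative and not part of the repo:

```shell
# Poll a PVC until its status.phase is Bound.
# $1 = claim name, $2 = number of attempts (default 30).
# POLL_INTERVAL (seconds, default 2) is a hypothetical knob for this sketch.
wait_for_pvc_bound() (
    name=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        phase=$(kubectl get pvc "$name" -o jsonpath='{.status.phase}')
        if [ "$phase" = "Bound" ]; then
            echo "PVC ${name} is Bound"
            exit 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    echo "PVC ${name} is still ${phase:-unknown}" >&2
    exit 1
)
```

A claim that stays Pending for the full duration is a hint to look at the Troubleshooting section at the end.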

Digging into ceph, seeing the actual image:

[20:05] server47.place7:~# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- rbd -p replicapool ls
csi-vol-d3c96f79-c7ba-11eb-8e52-1ed2f2d63451
[20:11] server47.place7:~#
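
To tie a specific claim to the image listed by `rbd ls`, the image name can be read from the bound PersistentVolume. This sketch assumes the CSI driver records it under `spec.csi.volumeAttributes.imageName`, which is how ceph-csi RBD volumes usually look; the function name is hypothetical:

```shell
# Print the RBD image name backing a PVC.
# $1 = claim name, $2 = namespace (default: default).
# Assumption: the PV carries the image name in its CSI volume attributes.
rbd_image_for_pvc() (
    pvc=$1
    ns=${2:-default}
    pv=$(kubectl -n "$ns" get pvc "$pvc" -o jsonpath='{.spec.volumeName}') || exit 1
    [ -n "$pv" ] || { echo "PVC ${pvc} has no bound volume" >&2; exit 1; }
    kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeAttributes.imageName}'
)
```

The printed name should match one of the `csi-vol-...` entries in the pool.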

Filesystem

[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS            RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running           0          16s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    0/1     PodInitializing   0          14s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS    RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running   0          20s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    1/1     Running   0          18s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
  cluster:
    id:     049110d9-9368-4750-b3d3-6ca9a80553d7
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum a,b,d (age 98m)
    mgr: a(active, since 97m), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 66m), 6 in (since 67m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 31 objects, 27 KiB
    usage:   40 MiB used, 45 GiB / 45 GiB avail
    pgs:     97 active+clean

  io:
    client:   3.3 KiB/s rd, 2.8 KiB/s wr, 2 op/s rd, 1 op/s wr

[21:07] server47.place7:~/ungleich-k8s/rook#
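
For scripted checks, the health field can be pulled out of the `ceph -s` text. The filter below is a small sketch that relies only on the output layout shown above; the function name is made up for this example:

```shell
# Read `ceph -s` output on stdin and print the value of the first
# "health:" line, e.g. HEALTH_OK or HEALTH_WARN.
ceph_health_status() {
    awk '$1 == "health:" { print $2; exit }'
}
```

Usage: `kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph -s | ceph_health_status`.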

DefaultStorageClass

By default, none of the created storage classes is marked as the cluster's default. One of them has to be set as the default if PersistentVolumeClaims are to be deployed without an explicit storageClassName:

[21:22] server47.place7:~/ungleich-k8s/rook# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
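
The same annotation can be switched off again when the default moves to another class. A tiny helper that builds the patch body for either direction, as a sketch (the function name is hypothetical):

```shell
# Print the JSON patch body that sets or clears the default-class
# annotation. $1 = "true" (default) or "false".
default_class_patch() {
    printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}\n' "${1:-true}"
}
```

Usage: `kubectl patch storageclass rook-ceph-block -p "$(default_class_patch false)"`. Afterwards `kubectl get storageclass` marks the current default with "(default)".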

Deleting in case rook gets stuck

  • You need to go through the list of resources manually; patching the finalizers does not work reliably

Especially entries like this one:

  finalizers:
  - cephblockpool.ceph.rook.io

  • The host is not cleaned up: the old /var/lib/rook directory persists and must be removed
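
To see which Rook objects still carry finalizers before going through them by hand, the custom resources can be listed together with their finalizer fields. This is only a sketch: the set of kinds below is an assumption about what is deployed, and the function name is hypothetical.

```shell
# List Rook custom resources and their finalizers, one kind at a time.
# $1 = namespace (default rook-ceph).
# Assumption: these four kinds cover what was created; extend as needed.
list_rook_finalizers() (
    ns=${1:-rook-ceph}
    for kind in cephcluster cephblockpool cephfilesystem cephobjectstore; do
        echo "== ${kind}"
        kubectl -n "$ns" get "$kind" \
            -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'
    done
)
```

Any object that shows up with a non-empty finalizer list is a candidate for blocking the teardown.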

Troubleshooting: PVC stuck pending, no csi-{cephfs,rbd}plugin-provisioner pod in rook-ceph namespace

2021-07-31: It seems that the provisioner plugins tend to die silently. Restarting the rook-ceph-operator deployment brings them back up:

kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph
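
The check and the restart can be combined so that the operator is only restarted when a provisioner is actually missing. A minimal sketch, assuming the provisioner pods carry the labels app=csi-rbdplugin-provisioner and app=csi-cephfsplugin-provisioner (the function name is made up for this example):

```shell
# Restart the Rook operator if no running CSI provisioner pod is found.
# $1 = namespace (default rook-ceph).
# Assumption: provisioner pods are labelled app=csi-{rbd,cephfs}plugin-provisioner.
ensure_csi_provisioners() (
    ns=${1:-rook-ceph}
    missing=0
    for app in csi-rbdplugin-provisioner csi-cephfsplugin-provisioner; do
        pods=$(kubectl -n "$ns" get pods -l "app=${app}" \
            --field-selector=status.phase=Running -o name)
        if [ -z "$pods" ]; then
            echo "no running ${app} pod in ${ns}" >&2
            missing=1
        fi
    done
    if [ "$missing" -eq 1 ]; then
        kubectl -n "$ns" rollout restart deployment/rook-ceph-operator
    fi
)
```

Calling `ensure_csi_provisioners` does nothing when both provisioners are running, so it is safe to run repeatedly.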