ungleich-k8s/rook/README.md
2021-08-05 21:06:04 +02:00

## v1: original rook manifests
```
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f crds.yaml -f common.yaml
kubectl apply -f operator.yaml
kubectl get -n rook-ceph pods --watch
kubectl apply -f cluster.yaml
kubectl apply -f csi/rbd/storageclass.yaml
kubectl apply -f toolbox.yaml
```
## v2 with included manifests
* Patched for IPv6 support
* Including RBD support
* Including CephFS support
```
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
    kubectl apply -f "${yaml}.yaml"
done
```
Deleting (in case of teardown):
```
# Delete in reverse order of creation, so the CRDs are removed last
for yaml in toolbox storageclass-rbd storageclass-cephfs cluster operator common crds; do
    kubectl delete -f "${yaml}.yaml"
done
```
## v3 via helm
```
helm repo add rook-release https://charts.rook.io/release
helm repo update
helm install --create-namespace --namespace rook-ceph rook-ceph rook-release/rook-ceph
helm install --create-namespace --namespace rook-ceph rook-ceph-cluster \
    --set operatorNamespace=rook-ceph rook-release/rook-ceph-cluster -f rook/values.yaml
```
## Debugging / ceph toolbox
```
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
```
## Creating a sample RBD device / PVC
```
kubectl apply -f pvc.yaml
```
Checks:
```
kubectl get pvc
kubectl describe pvc
kubectl get pv
kubectl describe pv
```
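The checks above can be wrapped in a small polling helper. This is only a sketch: the function name `wait_for_pvc_bound`, its arguments and the PVC name in the usage note are hypothetical, since the contents of `pvc.yaml` are not shown here.

```shell
#!/bin/sh
# Hypothetical helper: poll until a PVC reports phase "Bound" or the attempts run out.
# Usage: wait_for_pvc_bound NAME [NAMESPACE] [ATTEMPTS] [INTERVAL_SECONDS]
wait_for_pvc_bound() {
    name="$1"
    ns="${2:-default}"
    attempts="${3:-30}"
    interval="${4:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # .status.phase is Pending, Bound or Lost
        phase=$(kubectl -n "$ns" get pvc "$name" -o jsonpath='{.status.phase}')
        if [ "$phase" = "Bound" ]; then
            echo "pvc/$name is Bound"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "pvc/$name still in phase '$phase' after $attempts checks" >&2
    return 1
}
```

For example, `wait_for_pvc_bound my-pvc default 30 2` would check a PVC named `my-pvc` (an assumed name) every two seconds for up to a minute.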
Digging into ceph, seeing the actual image:
```
[20:05] server47.place7:~# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- rbd -p replicapool ls
csi-vol-d3c96f79-c7ba-11eb-8e52-1ed2f2d63451
[20:11] server47.place7:~#
```
## Filesystem
```
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS            RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running           0          16s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    0/1     PodInitializing   0          14s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS    RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running   0          20s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    1/1     Running   0          18s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
  cluster:
    id:     049110d9-9368-4750-b3d3-6ca9a80553d7
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum a,b,d (age 98m)
    mgr: a(active, since 97m), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 66m), 6 in (since 67m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 31 objects, 27 KiB
    usage:   40 MiB used, 45 GiB / 45 GiB avail
    pgs:     97 active+clean

  io:
    client:   3.3 KiB/s rd, 2.8 KiB/s wr, 2 op/s rd, 1 op/s wr

[21:07] server47.place7:~/ungleich-k8s/rook#
```
## DefaultStorageClass
By default, none of the created storage classes is marked as the
cluster's default. PersistentVolumeClaims that do not name a storage
class are only provisioned once one of them is set as the default:
```
[21:22] server47.place7:~/ungleich-k8s/rook# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
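If more than one storage class ends up flagged as default, it can help to clear the flag on all others while setting it on the wanted one. The following is a minimal sketch of that idea. The function names are hypothetical, and writing `"false"` onto every other class is an assumption about what is wanted, not something the setup above requires.

```shell
#!/bin/sh
# Build the annotation patch for the default-class flag ("true" or "false").
default_class_patch() {
    printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}' "$1"
}

# Hypothetical helper: make exactly one storage class the default.
# Usage: set_default_storageclass rook-ceph-block
set_default_storageclass() {
    wanted="$1"
    # "kubectl get -o name" prints entries like storageclass.storage.k8s.io/NAME
    for sc in $(kubectl get storageclass -o name); do
        if [ "$sc" != "storageclass.storage.k8s.io/$wanted" ]; then
            kubectl patch "$sc" -p "$(default_class_patch false)"
        fi
    done
    kubectl patch storageclass "$wanted" -p "$(default_class_patch true)"
}
```

Afterwards, `kubectl get storageclass` should show `(default)` next to exactly one class.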
## Deleting in case rook gets stuck
* Need to go through the list manually; patching the finalizers does
  not work reliably

Especially these:
```
finalizers:
- cephblockpool.ceph.rook.io
```
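To see which objects still carry finalizers, the Rook custom resources can be listed one kind at a time. This is a sketch only: the function name `list_rook_finalizers` is hypothetical, and it assumes that every relevant CRD name ends in `ceph.rook.io`.

```shell
#!/bin/sh
# Hypothetical helper: print every Rook custom resource with its finalizers.
# Usage: list_rook_finalizers [NAMESPACE]
list_rook_finalizers() {
    ns="${1:-rook-ceph}"
    # CRD names look like customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io
    for crd in $(kubectl get crd -o name | grep 'ceph\.rook\.io$'); do
        resource="${crd#customresourcedefinition.apiextensions.k8s.io/}"
        # One line per object: Kind/name <TAB> finalizers
        kubectl -n "$ns" get "$resource" \
            -o jsonpath='{range .items[*]}{.kind}{"/"}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'
    done
}
```

Objects that appear with a non-empty finalizer list are the ones to go through by hand.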
## Other flux-related problems
* The host is not cleaned up: the old `/var/lib/rook` directory persists
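A possible way to clear the leftover directory on each node, sketched under assumptions: root SSH access to the hosts is assumed, the function name `clean_rook_state` is hypothetical, and the host names in the usage note are placeholders. The command is destructive and only makes sense after the cluster has been torn down.

```shell
#!/bin/sh
# Hypothetical helper: remove Rook's leftover state directory on each given host.
# DESTRUCTIVE: run only after the Rook cluster has been deleted.
# Usage: clean_rook_state HOST [HOST...]
clean_rook_state() {
    for host in "$@"; do
        echo "cleaning /var/lib/rook on $host"
        ssh "root@$host" 'rm -rf /var/lib/rook'
    done
}
```

It would be called as `clean_rook_state node1 node2`, with the real node names substituted.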
## Troubleshooting: PVC stuck pending, no csi-{cephfs,rbd}provisioner-plugin pod in rook-ceph namespace
2021-07-31: it seems that the provisioner plugins tend to die silently.
Restarting the `rook-ceph-operator` deployment will get them back up:
```
kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph
```
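The check and the restart can be combined so that the operator is only restarted when a provisioner is actually missing. This is a sketch, not a tested procedure: the function name is hypothetical, and the pod labels `app=csi-rbdplugin-provisioner` and `app=csi-cephfsplugin-provisioner` are assumptions that should be verified with `kubectl -n rook-ceph get pods --show-labels`.

```shell
#!/bin/sh
# Hypothetical helper: restart the operator if a CSI provisioner has no running pod.
# Usage: restart_operator_if_provisioners_missing [NAMESPACE]
restart_operator_if_provisioners_missing() {
    ns="${1:-rook-ceph}"
    # Label values below are assumptions; check them against the running cluster.
    for app in csi-rbdplugin-provisioner csi-cephfsplugin-provisioner; do
        running=$(kubectl -n "$ns" get pods -l "app=$app" \
            --field-selector=status.phase=Running -o name | wc -l | tr -d ' ')
        if [ "$running" -eq 0 ]; then
            echo "no running pod for $app, restarting rook-ceph-operator"
            kubectl -n "$ns" rollout restart deployment/rook-ceph-operator
            return 0
        fi
    done
    echo "provisioner pods present, nothing to do"
}
```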