## TL;DR
We got rook running IPv6-only, using the
[helm operator
chart](https://rook.io/docs/rook/v1.7/helm-operator.html) together
with custom
[CephCluster](https://rook.io/docs/rook/v1.7/ceph-cluster-crd.html)
and [CephBlockPool](https://rook.io/docs/rook/v1.7/ceph-block.html)
objects.
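The IPv6-relevant part of such a CephCluster object looks roughly
like this (a sketch only; the image version and storage settings are
placeholders):
```
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.5   # placeholder version
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
  network:
    ipFamily: "IPv6"   # the knob that makes the cluster IPv6-only
  storage:
    useAllNodes: true
    useAllDevices: true
```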
## v1: original rook manifests
```
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f crds.yaml -f common.yaml
kubectl apply -f operator.yaml
kubectl get -n rook-ceph pods --watch
kubectl apply -f cluster.yaml
kubectl apply -f csi/rbd/storageclass.yaml
kubectl apply -f toolbox.yaml
```
## v2 with included manifests
* Patched for IPv6 support
* Including RBD support
* Including CephFS support
```
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
kubectl apply -f ${yaml}.yaml
done
```
Deleting (in case of teardown):
```
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
kubectl delete -f ${yaml}.yaml
done
```
## v3 via helm
```
helm repo add rook-release https://charts.rook.io/release
helm repo update
helm install --create-namespace --namespace rook-ceph rook-ceph rook-release/rook-ceph
helm install --create-namespace --namespace rook-ceph rook-ceph-cluster \
--set operatorNamespace=rook-ceph rook-release/rook-ceph-cluster -f rook/values.yaml
```
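rook/values.yaml is where the IPv6 setting from the custom objects
above carries over into the chart-managed CephCluster; a rough sketch
of the relevant keys (values are placeholders, see the
rook-ceph-cluster chart docs for the full schema):
```
toolbox:
  enabled: true
cephClusterSpec:
  network:
    ipFamily: "IPv6"   # keep the cluster IPv6-only
```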
## Debugging / ceph toolbox
```
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
```
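One-off checks also work without an interactive shell:
```
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd tree
```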
## Creating a sample RBD device / PVC
```
kubectl apply -f pvc.yaml
```
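For reference, a PVC of this shape is enough (a sketch of what
pvc.yaml likely contains; the name testpvc and the storage class
rook-ceph-block are assumptions based on the upstream examples):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testpvc                        # placeholder name
spec:
  storageClassName: rook-ceph-block    # as created by csi/rbd/storageclass.yaml
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```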
Checks:
```
kubectl get pvc
kubectl describe pvc
kubectl get pv
kubectl describe pv
```
Digging into ceph, seeing the actual image:
```
[20:05] server47.place7:~# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- rbd -p replicapool ls
csi-vol-d3c96f79-c7ba-11eb-8e52-1ed2f2d63451
[20:11] server47.place7:~#
```
## Filesystem
```
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS            RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running           0          16s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    0/1     PodInitializing   0          14s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph get pod -l app=rook-ceph-mds
NAME                                    READY   STATUS    RESTARTS   AGE
rook-ceph-mds-myfs-a-5f547fd7c6-qmp2r   1/1     Running   0          20s
rook-ceph-mds-myfs-b-dd78b444b-49h5h    1/1     Running   0          18s
[21:06] server47.place7:~/ungleich-k8s/rook# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
  cluster:
    id:     049110d9-9368-4750-b3d3-6ca9a80553d7
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum a,b,d (age 98m)
    mgr: a(active, since 97m), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 66m), 6 in (since 67m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 31 objects, 27 KiB
    usage:   40 MiB used, 45 GiB / 45 GiB avail
    pgs:     97 active+clean

  io:
    client:   3.3 KiB/s rd, 2.8 KiB/s wr, 2 op/s rd, 1 op/s wr

[21:07] server47.place7:~/ungleich-k8s/rook#
```
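Consuming the filesystem works like the RBD case, just with the
CephFS storage class and ReadWriteMany (a sketch; the class name
rook-cephfs follows the upstream csi/cephfs example, myfs-pvc is a
placeholder):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myfs-pvc                  # placeholder name
spec:
  storageClassName: rook-cephfs   # assumed class name from the upstream example
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```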
## DefaultStorageClass
By default none of the created storage classes is marked as the
cluster's default, so we need to set one if PersistentVolumeClaims
without an explicit storage class should be provisioned:
```
[21:22] server47.place7:~/ungleich-k8s/rook# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
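Verifying (the default class is marked in the output):
```
kubectl get storageclass
```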
## Deleting in case rook gets stuck
* Need to go through the stuck objects manually; patching the
  finalizers does not work reliably.

Especially these:
```
finalizers:
- cephblockpool.ceph.rook.io
```
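For reference, the patch that should clear such a finalizer (and, as
noted above, does not always take effect; replicapool is a
placeholder object name):
```
kubectl -n rook-ceph patch cephblockpool replicapool \
    --type merge -p '{"metadata":{"finalizers":[]}}'
```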
## Other flux-related problems
* The host is not cleared / the old /var/lib/rook persists (see the cleanup sketch below)
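A cleanup sketch for each host, following the upstream teardown
documentation (/dev/sdX is a placeholder for the OSD device):
```
rm -rf /var/lib/rook          # remove the old cluster state
sgdisk --zap-all /dev/sdX     # wipe the OSD disk for a fresh deployment
```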
## Troubleshooting: PVC stuck pending, no csi-{cephfs,rbd}provisioner-plugin pod in rook-ceph namespace
2021-07-31: it seems that the provisioner plugins tend to die silently.
Restarting the `rook-ceph-operator` deployment brings them back up:
```
kubectl rollout restart deployment/rook-ceph-operator -n rook-ceph
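# afterwards the provisioner pods should reappear:
kubectl -n rook-ceph get pods | grep provisioner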
```