Additional details for rook migration
Using BGP and calico, the kubernetes cluster is set up "as usual" (in
ungleich terms).

### Ceph.conf change

Originally our ceph.conf contained:

```
public network = 2a0a:e5c0:0:0::/64
cluster network = 2a0a:e5c0:0:0::/64
```

As of today these settings have been removed and all daemons have been
restarted, allowing the native cluster to talk to the kubernetes
cluster.

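Restarting the native daemons can look like the following. This is only
a sketch and assumes the daemons are managed by systemd with the
standard ceph targets; adjust it to whatever init system the native
nodes use.

```
# Run on each native node after removing the public/cluster network
# lines from /etc/ceph/ceph.conf (systemd assumed):
systemctl restart ceph-mon.target ceph-mgr.target ceph-osd.target

# Wait for the cluster to report a healthy state before continuing:
ceph -s
```
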
### Setting up rook

Usually we deploy rook via argocd. However, as we want to be able to
easily do manual intervention, we will first bootstrap rook via helm
directly and turn off various services:

```
helm repo add rook https://charts.rook.io/release
helm repo update
```

We will use rook 1.8, as it is the last version to support Ceph
nautilus, which is our current ceph version. The latest 1.8 version is
1.8.10 at the moment.

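To double check which 1.8.x releases the chart repository offers, helm
can list them; this is only a convenience check:

```
# List the available chart versions and pick the latest 1.8.x:
helm search repo rook/rook-ceph --versions | grep v1.8
```
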
```
helm upgrade --install --namespace rook-ceph --create-namespace --version v1.8.10 rook-ceph rook/rook-ceph
```

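Before creating any rook resources it is worth verifying that the
operator actually came up. A minimal check, assuming the default
deployment name from the chart:

```
# The operator pod should be Running before we apply any CRs:
kubectl --namespace rook-ceph get pods
kubectl --namespace rook-ceph logs deploy/rook-ceph-operator --tail=20
```
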
### Joining the 2 clusters, step 1: monitors and managers

In the first step we want to add rook based monitors and managers
and replace the native ones. For rook to be able to talk to our
existing cluster, it needs to know (see the sketch after this list):

* the current monitors/managers ("the monmap")
* the right keys to talk to the existing cluster
* the fsid

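The fsid and the keys can be read from the running native cluster. A
minimal sketch, assuming an admin keyring is present on the node we run
this on:

```
# The fsid of the existing cluster:
ceph fsid

# The admin and mon. keys needed to talk to the existing cluster:
ceph auth get client.admin
ceph auth get mon.

# The monmap itself is extracted from a stopped monitor, see
# "Extracting a monmap" below.
```
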
As we are using v1.8, we will follow
[the guidelines for disaster recovery of rook
1.8](https://www.rook.io/docs/rook/v1.8/ceph-disaster-recovery.html).

Later we will need to create all the configurations so that rook knows
about the different pools.

### Rook: CephCluster

Rook has a configuration of type `CephCluster` that typically looks
something like this:

```
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # see the "Cluster Settings" section below for more details on which image of ceph to run
    image: quay.io/ceph/ceph:{{ .Chart.AppVersion }}
  dataDirHostPath: /var/lib/rook
  mon:
    count: 5
    allowMultiplePerNode: false
  storage:
    useAllNodes: true
    useAllDevices: true
    onlyApplyOSDPlacement: false
  mgr:
    count: 1
    modules:
      - name: pg_autoscaler
        enabled: true
  network:
    ipFamily: "IPv6"
    dualStack: false
  crashCollector:
    disable: false
    # Uncomment daysToRetain to prune ceph crash entries older than the
    # specified number of days.
    daysToRetain: 30
```

For migrating, we don't want rook to create any OSDs in the first
stage. So we will replace `useAllNodes: true` with `useAllNodes: false`
and likewise `useAllDevices: true` with `useAllDevices: false`.

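To double check that the operator is not provisioning OSDs yet, the
rook pod labels can be queried; the label names below are the ones rook
usually sets, adjust if your deployment differs:

```
# With useAllNodes/useAllDevices set to false there should be no OSD
# pods and no OSD prepare jobs:
kubectl --namespace rook-ceph get pods -l app=rook-ceph-osd
kubectl --namespace rook-ceph get pods -l app=rook-ceph-osd-prepare
```
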
### Extracting a monmap

To get access to the existing monmap, we can export it from the native
cluster using `ceph-mon -i {mon-id} --extract-monmap {map-path}`.
More details can be found in the [documentation for adding and
removing ceph
monitors](https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/).

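In practice the extraction has to happen while the monitor in question
is stopped. A rough sketch, assuming systemd and a monitor id of
`server1` (both only examples):

```
# Stop one native monitor, extract the monmap, start the monitor again:
systemctl stop ceph-mon@server1
ceph-mon -i server1 --extract-monmap /tmp/monmap
systemctl start ceph-mon@server1

# Inspect the extracted map:
monmaptool --print /tmp/monmap
```
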
### Rook and Ceph pools

Rook uses `CephBlockPool` to describe ceph pools as follows:

```
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: hdd
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
  deviceClass: hdd
```

In this particular cluster we have 2 pools:

- one (ssd based, device class = ssd)
- hdd (hdd based, device class = hdd-big)

The device class "hdd-big" is specific to this cluster as it used to
contain 2.5" and 3.5" HDDs in different pools.

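The pool names and device classes can be confirmed on the native
cluster with the standard ceph tooling:

```
# Show pools, device classes and the crush rules that tie them together:
ceph osd pool ls detail
ceph osd crush class ls
ceph osd crush rule ls
```
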
## Changelog