ungleich-staticcms/content/u/blog/2022-08-27-migrating-ceph-nautilus-into-kubernetes-with-rook/contents.lr

title: [WIP] Migrating Ceph Nautilus into Kubernetes + Rook
---
pub_date: 2022-08-27
---
author: ungleich storage team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
How we move our Ceph clusters into Kubernetes
---
body:

## Introduction

At ungleich we run multiple Ceph clusters. Some of them run
Ceph Nautilus (14.x) natively on
[Devuan](https://www.devuan.org/). Our newer Ceph Pacific (16.x)
clusters run on [Rook](https://rook.io/) on
[Kubernetes](https://kubernetes.io/) on top of
[Alpine Linux](https://alpinelinux.org/).

In this blog article we will describe how to migrate
Ceph/Native/Devuan to Ceph/k8s+rook/Alpine Linux.

## Work in Progress [WIP]

This blog article is a work in progress. Migration planning has
started, but the migration itself has not been finished yet. This article
will document the different paths we take during the migration.

## The Plan

To continue operating the cluster during the migration, the following
steps are planned:

* Set up a k8s cluster that is able to communicate with the
  existing ceph cluster
* Use the [disaster
  recovery](https://rook.io/docs/rook/v1.9/Troubleshooting/disaster-recovery/)
  guidelines from rook to modify the rook configuration to use the
  previous fsid
* Spin up ceph monitors and ceph managers in rook
* Retire the existing monitors
* Shut down a ceph OSD node, remove its OS disk, boot it with Alpine
  Linux
* Join the node into the k8s cluster
* Have rook pick up the existing disks and start the OSDs
* Repeat if successful
* Migrate to ceph pacific
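
The fsid step depends on knowing the identifier of the existing
cluster. On a running cluster `ceph fsid` prints it. As a minimal
sketch, assuming the conventional `fsid = ...` line in a ceph.conf,
it can also be read from the configuration file. The function name
and the sample file below are illustrative only:

```shell
# Print the fsid configured in a ceph.conf-style file.
# Sketch only: assumes a single "fsid = <uuid>" line.
fsid_from_conf() {
    awk -F= '
        /^[[:space:]]*fsid[[:space:]]*=/ {
            gsub(/[[:space:]]/, "", $2)   # strip padding around the value
            print $2
            exit
        }' "$1"
}

# Example with a throwaway file; the fsid is a made-up placeholder.
conf=$(mktemp)
cat > "$conf" <<EOF
[global]
fsid = 00000000-1111-2222-3333-444444444444
public network = 2a0a:e5c0:0:0::/64
EOF
fsid_from_conf "$conf"
rm -f "$conf"
```

On an existing node the call would typically be
`fsid_from_conf /etc/ceph/ceph.conf`.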

### Original cluster

The target ceph cluster we want to migrate lives in the 2a0a:e5c0::/64
network. Ceph is using:

```
public network  = 2a0a:e5c0:0:0::/64
cluster network = 2a0a:e5c0:0:0::/64
```

### Kubernetes cluster networking inside the ceph network

To be able to communicate with the existing OSDs, we will be using
subnetworks of 2a0a:e5c0::/64 for kubernetes. As these networks
are part of the on-link network 2a0a:e5c0::/64, we will use BGP
routing on the existing ceph nodes to create more specific routes into
the kubernetes cluster.

As we plan to use either [cilium](https://cilium.io/) or
[calico](https://www.tigera.io/project-calico/) as the CNI, we can
configure kubernetes to directly BGP peer with the existing Ceph
nodes.

## The setup

### Kubernetes Bootstrap

As usual we bootstrap 3 control plane nodes using kubeadm. The proxy
for the API resides in a different kubernetes cluster.

We run

```
kubeadm init --config kubeadm.yaml
```

on the first node and join the other two control plane nodes. As
usual, the workers join last.
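
The contents of `kubeadm.yaml` are not shown in this article. As a
rough sketch only, a minimal configuration could be generated as
follows. The endpoint name and the pod subnet are hypothetical
placeholders, the `apiVersion` depends on the kubeadm release in use,
and only the service subnet is taken from the calico configuration
further down:

```shell
# Write a minimal kubeadm configuration file.
# $1: control plane endpoint (the API proxy), $2: pod subnet, $3: output file.
write_kubeadm_config() {
    cat > "$3" <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "$1"
networking:
  # Service range as used in the calico BGPConfiguration of this article
  serviceSubnet: "2a0a:e5c0:aaaa::/108"
  podSubnet: "$2"
EOF
}

# Hypothetical values; replace them with the real proxy name and pod range.
out="${TMPDIR:-/tmp}/kubeadm.yaml"
write_kubeadm_config "k8s-api.example.com:6443" "2a0a:e5c0:0:0:1::/80" "$out"
cat "$out"
```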

### k8s Networking / CNI

For this setup we are using calico as described in the
[ungleich kubernetes
manual](https://redmine.ungleich.ch/projects/open-infrastructure/wiki/The_ungleich_kubernetes_infrastructure#section-23).

```
VERSION=v3.23.3
helm repo add projectcalico https://docs.projectcalico.org/charts
helm upgrade --install --namespace tigera calico projectcalico/tigera-operator --version $VERSION --create-namespace
```

### BGP Networking on the old nodes

To be able to import the BGP routes from Kubernetes, all old / native
hosts will run bird. The installation and configuration are as follows:

```
apt-get update
apt-get install -y bird2

router_id=$(hostname | sed 's/server//')

cat > /etc/bird/bird.conf <<EOF

router id $router_id;

log syslog all;
protocol device {
}
 # We are only interested in IPv6, skip another section for IPv4
protocol kernel {
        ipv6 { export all; };
}
protocol bgp k8s {
        local     as 65530;
        neighbor range 2a0a:e5c0::/64 as 65533;
        dynamic name "k8s_"; direct;

        ipv6 {
            import filter { if net.len > 64 then accept; else reject; };
            export none;
        };
}
EOF
/etc/init.d/bird restart

```

The router id must be unique for every host. As all hosts have a
unique number in their hostname, we use that number as the router id.
The bird configuration uses dynamic peers, so that any k8s
node in the network can peer with the old servers.
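
The router id derivation can be made a little more defensive. The
following is a sketch under the assumption that hostnames follow the
`serverNN` pattern. The function name is made up, and rejecting an id
of 0 is a choice of this sketch:

```shell
# Derive a numeric router id from a hostname such as "server42".
# Prints the id, or returns 1 if the remainder is not a positive number.
router_id_from_hostname() {
    id=$(printf '%s\n' "$1" | sed 's/server//')
    case "$id" in
        '' | *[!0-9]*)
            echo "cannot derive a router id from '$1'" >&2
            return 1
            ;;
    esac
    id=$(printf '%s\n' "$id" | sed 's/^0*//')   # "07" becomes "7"
    if [ -z "$id" ]; then
        echo "router id for '$1' would be 0" >&2
        return 1
    fi
    printf '%s\n' "$id"
}

router_id_from_hostname server42
```

In the installation script this would replace the plain `sed` call,
for example `router_id=$(router_id_from_hostname "$(hostname)") || exit 1`,
so that a host with an unexpected name does not end up with a broken
bird configuration.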

We also use a filter that only accepts prefixes longer than /64. This
avoids receiving /64 routes, which would overlap with the on-link route.
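
To illustrate which announcements pass this filter, the following
sketch mimics the `net.len > 64` check in shell. It is not part of
the bird setup, and the two longer prefixes are hypothetical examples
of what the k8s nodes might announce:

```shell
# Mimic the import filter "if net.len > 64 then accept; else reject;".
# $1 is a prefix in CIDR notation, e.g. 2a0a:e5c0::1:0/112.
would_import() {
    len=${1##*/}          # text after the last slash = prefix length
    [ "$len" -gt 64 ]
}

# Only the first prefix appears in this article; the others are examples.
for net in 2a0a:e5c0::/64 2a0a:e5c0:0:0:1::/80 2a0a:e5c0::1:0/112; do
    if would_import "$net"; then
        echo "$net accept"
    else
        echo "$net reject"
    fi
done
```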

### BGP networking in Kubernetes

Calico supports BGP peering and we use a rather standard calico
configuration:

```
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  asNumber: 65533
  serviceClusterIPs:
  - cidr: 2a0a:e5c0:aaaa::/108
  serviceExternalIPs:
  - cidr: 2a0a:e5c0:aaaa::/108
```

In addition, for each server and router we create a BGPPeer:

```
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: serverXX
spec:
  peerIP: 2a0a:e5c0::XX
  asNumber: 65530
  keepOriginalNextHop: true
```
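
Writing one such object per host by hand gets repetitive. A small
generator, sketched here under the assumption that the host number
can be substituted directly for `XX` in both the name and the address,
could print the manifests. The function name and the host numbers are
made up for illustration:

```shell
# Print one calico BGPPeer manifest per host number given as argument,
# following the serverXX / 2a0a:e5c0::XX pattern shown above.
generate_bgp_peers() {
    for n in "$@"; do
        printf '%s\n' '---'     # YAML document separator
        cat <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: server$n
spec:
  peerIP: 2a0a:e5c0::$n
  asNumber: 65530
  keepOriginalNextHop: true
EOF
    done
}

# Host numbers 10 and 11 are placeholders.
generate_bgp_peers 10 11
```

The output could be appended to the file that is later passed to
calicoctl.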

We apply the whole configuration using calicoctl:

```
./calicoctl create -f - < ~/vcs/k8s-config/bootstrap/p5-cow/calico-bgp.yaml
```

And a few seconds later we can observe the established BGP sessions on
the old / native hosts:

```
bird> show protocols
Name       Proto      Table      State  Since         Info
device1    Device     ---        up     23:09:01.393
kernel1    Kernel     master6    up     23:09:01.393
k8s        BGP        ---        start  23:09:01.393  Passive
k8s_1      BGP        ---        up     23:33:01.215  Established
k8s_2      BGP        ---        up     23:33:01.215  Established
k8s_3      BGP        ---        up     23:33:01.420  Established
k8s_4      BGP        ---        up     23:33:01.215  Established
k8s_5      BGP        ---        up     23:33:01.215  Established

```
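
When checking many hosts, counting the established sessions is
quicker than reading the table. The following sketch parses the
`show protocols` output shown above. The function name is made up,
and the column layout is assumed to match that output:

```shell
# Count BGP sessions in state "Established" from "show protocols" output on stdin.
count_established() {
    awk '$2 == "BGP" && $NF == "Established" { n++ } END { print n + 0 }'
}

# Sample input copied from the output above.
count_established <<EOF
Name       Proto      Table      State  Since         Info
device1    Device     ---        up     23:09:01.393
kernel1    Kernel     master6    up     23:09:01.393
k8s        BGP        ---        start  23:09:01.393  Passive
k8s_1      BGP        ---        up     23:33:01.215  Established
k8s_2      BGP        ---        up     23:33:01.215  Established
k8s_3      BGP        ---        up     23:33:01.420  Established
k8s_4      BGP        ---        up     23:33:01.215  Established
k8s_5      BGP        ---        up     23:33:01.215  Established
EOF
```

For the sample this prints 5, one per k8s node. On a host the input
would come from `birdc show protocols | count_established`.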

## Changelog

### 2022-08-27

* The initial release of this blog article


## Follow up or questions

You can join the discussion about this migration in the matrix room
`#kubernetes:ungleich.ch`. If you don't have a matrix
account, you can join using our chat on https://chat.with.ungleich.ch.