title: [WIP] Migrating Ceph Nautilus into Kubernetes + Rook
---
pub_date: 2022-08-27
---
author: ungleich storage team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
How we move our Ceph clusters into kubernetes
---
body:

## Introduction

At ungleich we run multiple Ceph clusters. Some of them run
Ceph Nautilus (14.x) natively on
[Devuan](https://www.devuan.org/). Our newer Ceph Pacific (16.x)
clusters run on [Rook](https://rook.io/) on
[Kubernetes](https://kubernetes.io/) on top of
[Alpine Linux](https://alpinelinux.org/).

In this blog article we describe how to migrate
Ceph/Native/Devuan to Ceph/k8s+rook/Alpine Linux.

## Work in Progress [WIP]

This blog article is a work in progress. The migration planning has
started, but the migration itself has not been finished yet. This article
will document the different paths we take for the migration.

## The Plan

To keep the cluster operating during the migration, the following
steps are planned:

* Set up a k8s cluster that can communicate with the
  existing ceph cluster
* Use the [disaster recovery](https://rook.io/docs/rook/v1.9/Troubleshooting/disaster-recovery/)
  guidelines from rook to modify the rook configuration to use the
  previous fsid
* Spin up ceph monitors and ceph managers in rook
* Retire the existing monitors
* Shut down a ceph OSD node, remove its OS disk and boot it with Alpine
  Linux
* Join the node into the k8s cluster
* Have rook pick up the existing disks and start the OSDs
* Repeat if successful
* Migrate to ceph pacific

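
The fsid step in the plan requires knowing the fsid of the existing
cluster. As a minimal sketch, not our actual tooling, the following
Python snippet reads it from a ceph.conf style text and checks that it
is a well-formed UUID. The sample configuration, the placeholder fsid
and the function name `read_fsid` are assumptions made for
illustration; on a real node the value would come from the cluster's
own ceph.conf.

```python
import configparser
import uuid

# Hypothetical ceph.conf content. The fsid is a placeholder and does
# not belong to any real cluster.
SAMPLE_CEPH_CONF = """
[global]
fsid = 00000000-0000-4000-8000-000000000000
public network = 2a0a:e5c0:0:0::/64
cluster network = 2a0a:e5c0:0:0::/64
"""


def read_fsid(conf_text: str) -> str:
    """Return the fsid from the [global] section of a ceph.conf style text."""
    # Only "=" is used as delimiter so that IPv6 values containing ":"
    # are never split; interpolation is disabled to keep "%" literal.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    fsid = parser["global"]["fsid"]
    # An fsid is a UUID; uuid.UUID raises ValueError if it is malformed.
    return str(uuid.UUID(fsid))


if __name__ == "__main__":
    print(read_fsid(SAMPLE_CEPH_CONF))
```

The value obtained this way is the one the rook configuration has to
be changed to, following the disaster recovery guidelines linked above.
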
## Original cluster

The target ceph cluster we want to migrate lives in the 2a0a:e5c0::/64
network. Ceph is using:

```
public network = 2a0a:e5c0:0:0::/64
cluster network = 2a0a:e5c0:0:0::/64
```

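
The network is written as 2a0a:e5c0::/64 in the text and as
2a0a:e5c0:0:0::/64 in the ceph configuration. Both spellings describe
the same prefix, which can be checked with a short sketch using
Python's standard `ipaddress` module. The variable names are only
illustrative.

```python
import ipaddress

# The two spellings of the ceph network used in this article.
short_form = ipaddress.ip_network("2a0a:e5c0::/64")
long_form = ipaddress.ip_network("2a0a:e5c0:0:0::/64")

# Both strings parse to the same network, so the public and the
# cluster network are the single /64 described in the text.
same_network = short_form == long_form

if __name__ == "__main__":
    print(same_network)
```
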
## Kubernetes cluster networking inside the ceph network

To be able to communicate with the existing OSDs, we will use
sub networks of 2a0a:e5c0::/64 for kubernetes. As these sub networks
are part of the network 2a0a:e5c0::/64 that is assigned to the link,
the existing ceph nodes would otherwise treat them as directly
reachable. We will therefore use BGP routing on the existing ceph
nodes to create more specific routes into the kubernetes cluster.

As we plan to use either [cilium](https://cilium.io/) or
[calico](https://www.tigera.io/project-calico/) as the CNI, we can
configure kubernetes to BGP peer directly with the existing Ceph
nodes.


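
To illustrate why more specific routes are enough, here is a small
Python sketch using the standard `ipaddress` module. It carves sub
networks out of the ceph /64 and picks a route by longest prefix
match, the way a router does. The /80 prefix length, the decision to
skip the first sub network, the example addresses and the function
names are assumptions for illustration only; this article does not
fix the size or position of the kubernetes sub networks.

```python
import ipaddress
from itertools import islice

CEPH_NETWORK = ipaddress.ip_network("2a0a:e5c0::/64")

# Assumption for illustration only: kubernetes gets /80 sub networks.
K8S_PREFIX_LENGTH = 80


def k8s_subnets(count):
    """Return `count` /80 sub networks of the ceph network.

    The first /80 is skipped; in this sketch it is assumed to hold
    the addresses of the existing ceph nodes.
    """
    subnets = CEPH_NETWORK.subnets(new_prefix=K8S_PREFIX_LENGTH)
    return list(islice(subnets, 1, 1 + count))


def best_route(address, routes):
    """Pick the matching route with the longest prefix."""
    matching = [route for route in routes if address in route]
    return max(matching, key=lambda route: route.prefixlen)


if __name__ == "__main__":
    k8s_subnet = k8s_subnets(1)[0]
    routes = [CEPH_NETWORK, k8s_subnet]

    # Hypothetical addresses: one pod inside the sub network and one
    # existing OSD node outside of it.
    pod = ipaddress.ip_address("2a0a:e5c0:0:0:1::42")
    osd = ipaddress.ip_address("2a0a:e5c0::10")

    print(pod, "->", best_route(pod, routes))
    print(osd, "->", best_route(osd, routes))
```

In this sketch the pod address matches both the /64 and the /80, and
the /80 wins because it is more specific. The OSD address only matches
the /64 and stays on the link.
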
## Changelog

### 2022-08-27

* The initial release of this blog article


## Follow up or questions

You can join the discussion about this migration in the matrix
room #kubernetes:ungleich.ch. If you don't have a matrix account,
you can join using our chat on https://chat.with.ungleich.ch.