## IPv6 only kubernetes clusters

This project tests, deploys and uses IPv6-only k8s clusters.

## Docs

* [Setting up the cluster with calico](v3-calico/README.md)
* [Bootstrapping Rook](rook/README.md)

## Working

* networking (calico)
* ceph with rook (cephfs, rbd)
* letsencrypt (nginx, certbot, homemade)
* k8s test on arm64
* CI/CD using flux
* Chart repository (chartmuseum)
* Git repository (gitea)

## Not (yet) working or tested

* proxy for pulling images only
  * configure a proxy on crio
  * setup a proxy in the cluster (?)
* virtualisation (VMs, kubevirt)
* network policies
* Prometheus for the cluster
* Maybe LoadBalancer support (our ClusterIP already does that though)
* (Other) DNS entries for services
* Internal backup / snapshots
* External backup (rsync, rbd mirror, etc.)

## Cluster setup

* Calico CNI with BGP peering to our upstream infrastructure
* Rook for RBD and CephFS support

The following steps are a full walkthrough of setting up the
IPv6-only kubernetes cluster "c2.k8s.ooo".

### Initialise the master with kubeadm

We are using a custom kubeadm configuration file (kubeadm.yaml) to

* configure the cgroupdriver (for alpine)
* configure the IP addresses
* configure the DNS domain (c2.k8s.ooo)

```
kubeadm init --config k8s/c2/kubeadm.yaml
```
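The contents of `k8s/c2/kubeadm.yaml` are not reproduced here. As a rough
sketch only, a configuration covering the three points above might look like
the following. The DNS domain and the service subnet are taken from this
walkthrough; the `cgroupfs` driver, the pod subnet and the helper name
`example_kubeadm_config` are assumptions made for illustration.

```shell
# Hypothetical helper: prints an example kubeadm configuration to stdout.
# It is NOT the file used for c2.k8s.ooo, only an illustration of its shape.
example_kubeadm_config() {
    cat <<'EOF'
# v1beta2 matches kubeadm v1.21; newer releases use a newer API version.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  dnsDomain: c2.k8s.ooo
  # Assumption: pod subnet guessed from the /122 blocks seen in the BGP routes.
  podSubnet: "2a0a:e5c0:13:e1::/64"
  serviceSubnet: "2a0a:e5c0:13:e2::/108"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Assumption: Alpine does not run systemd, so cgroupfs is used here.
cgroupDriver: cgroupfs
EOF
}
```

It could be written to a file with `example_kubeadm_config > kubeadm-example.yaml`
and compared against the real configuration before use.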

### Adding worker nodes

```
kubeadm join [2a0a:e5c0:13:0:225:b3ff:fe20:38cc]:6443 --token cfrita.. \
        --discovery-token-ca-cert-hash sha256:...
```

Verifying that all nodes joined:

```
% kubectl get nodes
NAME       STATUS   ROLES                  AGE     VERSION
server47   Ready    control-plane,master   2m25s   v1.21.1
server48   Ready    <none>                 66s     v1.21.1
server49   Ready    <none>                 24s     v1.21.1
server50   Ready    <none>                 19s     v1.21.1

```
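When scripting this check, the node list can be filtered instead of read by
eye. The following is a minimal sketch; `not_ready_nodes` is a hypothetical
helper name and it only looks at the STATUS column.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Expects the output of `kubectl get nodes --no-headers` on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}
```

Usage: `kubectl get nodes --no-headers | not_ready_nodes`. Empty output means
all nodes report Ready.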

### Configuring networking

* This customised calico.yaml enables IPv6

```
kubectl apply -f cni-calico/calico.yaml
```

After applying, check that all calico pods are up and running:

```
% kubectl -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-b656ddcfc-5kfg6   0/1     Running   4          3m27s
calico-node-975vh                         1/1     Running   3          3m28s
calico-node-gbnvj                         1/1     Running   2          3m28s
calico-node-qjm5v                         0/1     Running   4          113s
calico-node-xxxmk                         1/1     Running   4          3m28s
coredns-558bd4d5db-56dv9                  1/1     Running   0          8m51s
coredns-558bd4d5db-hsspb                  1/1     Running   0          8m51s
etcd-server47                             1/1     Running   0          9m9s
kube-apiserver-server47                   1/1     Running   0          9m4s
kube-controller-manager-server47          1/1     Running   0          9m4s
kube-proxy-5g5qm                          1/1     Running   0          8m51s
kube-proxy-85mck                          1/1     Running   0          7m8s
kube-proxy-b95sv                          1/1     Running   0          7m13s
kube-proxy-mpjkm                          1/1     Running   0          7m55s
kube-scheduler-server47                   1/1     Running   0          9m10s
```
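Note that in the listing above two pods are `Running` but still show `0/1` in
the READY column. A small filter makes such pods easier to spot. This is a
sketch only; `unready_pods` is a hypothetical helper name.

```shell
# Print the names of pods whose READY column (e.g. "0/1") shows fewer
# ready containers than expected.
# Expects the output of `kubectl get pods --no-headers` on stdin.
unready_pods() {
    awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}
```

Usage: `kubectl -n kube-system get pods --no-headers | unready_pods`.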

Often some pods will crash in the beginning. If the mounts are not
already shared, you might need to make them shared like this:

```
mount --make-shared /sys
mount --make-shared /run
```

(The above mounts are necessary on Alpine Linux.)
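Whether a mount is already shared can be read from `/proc/self/mountinfo`,
where shared mounts carry a `shared:N` tag in the optional fields. The
following is a minimal sketch of such a check; `is_shared_mount` is a
hypothetical helper name.

```shell
# Succeeds (exit 0) if the given mount point is marked as shared.
# $1: mount point, stdin: contents of /proc/self/mountinfo.
# In mountinfo, field 5 is the mount point and the optional fields start
# at field 7 and end at the "-" separator.
is_shared_mount() {
    awk -v mp="$1" '
        $5 == mp {
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: `is_shared_mount /sys < /proc/self/mountinfo || mount --make-shared /sys`.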

### Getting calicoctl

To configure calico, we need calicoctl, which we can run in
yet-another-pod as follows:

```
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
```

And we alias it for easier usage:

```
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
```

### Adding BGP peering

We need to tell calico which BGP peers to peer with. For this we
use the bgp-c2.yaml file, which contains the configuration for our
cluster:


```
calicoctl create -f - < cni-calico/bgp-c2.yaml
```
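The contents of `bgp-c2.yaml` are not shown in this document. As a rough
sketch only, a Calico BGP configuration of this kind usually consists of a
`BGPConfiguration` and one or more `BGPPeer` resources. The AS number 65534
and the service prefix are inferred from the route listing in this section;
the peer address (a documentation address), the peer AS number, the resource
name `upstream-example` and the helper name `example_bgp_config` are
placeholders, not our real values.

```shell
# Hypothetical helper: prints an example Calico BGP configuration to stdout.
# It is NOT the content of cni-calico/bgp-c2.yaml.
example_bgp_config() {
    cat <<'EOF'
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  # Inferred from the [AS65534i] entries in the birdc output.
  asNumber: 65534
  # Announce the service (ClusterIP) range to the BGP peers.
  serviceClusterIPs:
    - cidr: "2a0a:e5c0:13:e2::/108"
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-example
spec:
  # Placeholder values: replace with the real upstream router and its AS.
  peerIP: "2001:db8::1"
  asNumber: 64512
EOF
}
```

Such a file would be applied the same way as above, for example with
`example_bgp_config | calicoctl create -f -`.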

At this point all nodes should be peering with our upstream
infrastructure.
We can confirm this on the upstream side, where we also run bird:

```
% birdc show route
BIRD 2.0.7 ready.
Table master6:
2a0a:e5c0:13:e1:f4c5:ab65:a67f:53c0/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server3 20:04:14.224] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e2::/108 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.113] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e1:176b:eaa6:6d47:1c40/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13::/48    unreachable [v6 2021-05-16] * (200)
```
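In the listing above every pod block and the service range is reachable via
all four nodes. To check this without counting by hand, the `via` lines can
be tallied per prefix. This is a minimal sketch that assumes the output
format shown above; `count_next_hops` is a hypothetical helper name.

```shell
# Print "<prefix> <number of next hops>" for every routed prefix.
# Expects the output of `birdc show route` on stdin.
count_next_hops() {
    awk '
        # A line starting with an IPv6 prefix opens a new route entry.
        /^[0-9a-f:]+\/[0-9]+/ { prefix = $1 }
        # Every "via" line is one next hop for the current prefix.
        $1 == "via" { hops[prefix]++ }
        END { for (p in hops) print p, hops[p] }
    ' | sort
}
```

Usage: `birdc show route | count_next_hops`. Prefixes without any `via` line,
such as the unreachable /48, are not printed.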

### Testing the cluster

At this point we should have a functioning k8s cluster. Now we
test whether it works using a simple nginx deployment.

Do *NOT* use https://k8s.io/examples/application/deployment.yaml. It
contains an outdated nginx container that has no IPv6 listener. You
will get results such as

```
% curl http://[2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41]
curl: (7) Failed to connect to 2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41 port 80: Connection refused
```

if you use that deployment. Instead use something along the lines of the
included **nginx-test-deployment.yaml**:

```
kubectl apply -f generic/nginx-test-deployment.yaml
```

Let's see whether the pods are coming up:

```
% kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-95d596f7b-484mz   1/1     Running   0          13s
nginx-deployment-95d596f7b-4wfkp   1/1     Running   0          13s
```

And the associated service:

```
% kubectl get svc
NAME            TYPE        CLUSTER-IP              EXTERNAL-IP   PORT(S)   AGE
kubernetes      ClusterIP   2a0a:e5c0:13:e2::1      <none>        443/TCP   16m
nginx-service   ClusterIP   2a0a:e5c0:13:e2::4412   <none>        80/TCP    34s
```

It is up and running, let's curl it!

```
% curl -I http://[2a0a:e5c0:13:e2::4412]
HTTP/1.1 200 OK
Server: nginx/1.20.0
Date: Mon, 14 Jun 2021 18:08:29 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 20 Apr 2021 16:11:05 GMT
Connection: keep-alive
ETag: "607efd19-264"
Accept-Ranges: bytes
```

Perfect. Let's delete it again:

```
kubectl delete -f generic/nginx-test-deployment.yaml
```

### Next steps

While the above is already a fully running k8s cluster, we also want
support for **PersistentVolumeClaims**. See [the rook
documentation](rook/README.md) on how to achieve the next step.

## Highly available control plane

The steps above result in a single control plane node. For production
setups, however, three nodes should be in the control plane.

The [guide for creating HA
clusters](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/)
refers to an external load balancer in front of the kube-apiserver
instances.

## Secrets

### Generating them inside the cluster

Handled via https://github.com/mittwald/kubernetes-secret-generator

```
helm repo add mittwald https://helm.mittwald.de
helm repo update
helm upgrade --install kubernetes-secret-generator mittwald/kubernetes-secret-generator
```

Generating / creating secrets (the annotation asks the generator to fill
in the `password` field):

```
apiVersion: v1
kind: Secret
metadata:
  name: string-secret
  annotations:
    secret-generator.v1.mittwald.de/autogenerate: password
data:
  username: c29tZXVzZXI=
```

* Advantage: passwords are only in the cluster
* Disadvantage: passwords are only in the cluster
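The values under `data:` in a Secret are base64 encoded, not encrypted. The
two helpers below are a small sketch for converting such values by hand;
their names are hypothetical.

```shell
# Encode a plain value for use under "data:" in a Secret manifest.
encode_secret_value() {
    printf '%s' "$1" | base64
}

# Decode a value taken from "data:" back to plain text.
decode_secret_value() {
    printf '%s' "$1" | base64 -d
}
```

For example, `decode_secret_value c29tZXVzZXI=` prints `someuser`, the
username from the manifest above.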

## CI/CD

### What we want

* Package everything into one git repository (charts, kustomize, etc.)
* Be usable for multiple clusters
* Easily apply cross cluster

### What we don't want / what is problematic

* Uploading charts to something like chartmuseum
  * Is redundant - we have a version in git
  * Is manual (could probably be automated)

### ArgoCD

Looks too big, too complex, too complicated.

### FluxCD2

Looks ok, handling of helm is ok, but does not feel intuitive. Seems
to be more oriented towards "kustomizing helm charts".

### Helmfile

[helmfile](https://github.com/roboll/helmfile/) seems to do most of
what we need.
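As an illustration only, the secret generator installed earlier with plain
helm commands could be described in a helmfile roughly like this. The
repository and chart names come from the Secrets section; the helper name
`example_helmfile` is hypothetical and this is not a file from this
repository.

```shell
# Hypothetical helper: prints an example helmfile.yaml to stdout.
example_helmfile() {
    cat <<'EOF'
repositories:
  - name: mittwald
    url: https://helm.mittwald.de

releases:
  - name: kubernetes-secret-generator
    chart: mittwald/kubernetes-secret-generator
EOF
}
```

It could be tried with `example_helmfile > helmfile.yaml` followed by
`helmfile apply` in the same directory.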

## The IPv4 "problem"

* Clusters are IPv6 only
* Need to have one or more services to map IPv4
* Maybe outside haproxy w/ generic ssl/sni/host mapping
  * Could even be **inside** haproxy service

## Flux + Chartmuseum

* For automatic deployments, we can use flux
* To be able to use flux with our charts, we need a Chartmuseum
* To access a private chartmuseum, we need a shared secret
* Thus we probably do need sops or similar

-alternative-

* Using kustomize, local resources can be used