ungleich-k8s/README.md
Nico Schottelius 30384ff32a ++stuff
2021-06-20 10:27:20 +02:00

IPv6-only Kubernetes clusters

This project tests, deploys and uses IPv6-only k8s clusters.

Docs

Working

  • networking (calico)
  • ceph with rook (cephfs, rbd)
  • letsencrypt (nginx, certbot, homemade)
  • k8s test on arm64

Not (yet) working or tested

  • virtualisation (VMs, kubevirt)
  • network policies
  • prometheus in the cluster
  • argocd (?) for CI and upgrades
  • Maybe LoadBalancer support (our ClusterIP already does that though)
  • (Other) DNS entries for services

Cluster setup

  • Calico CNI with BGP peering to our upstream infrastructure
  • Rook for RBD and CephFS support

The following steps are a full walkthrough of setting up the IPv6-only Kubernetes cluster "c2.k8s.ooo".

Initialise the master with kubeadm

We are using a custom kubeadm configuration (k8s/c2/kubeadm.yaml) to

  • configure the cgroup driver (for Alpine)
  • configure the IP addresses
  • configure the DNS domain (c2.k8s.ooo)

kubeadm init --config k8s/c2/kubeadm.yaml
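
The real k8s/c2/kubeadm.yaml is not reproduced here. As a rough sketch of what such a file could look like, the helper below writes a minimal configuration covering the three points above. All values are assumptions: the pod subnet is guessed, the service subnet is inferred from the /108 route shown later in this document, and `cgroupfs` is assumed as the driver for Alpine. The function name `write_kubeadm_config` is hypothetical.

```shell
# Sketch only: writes a hypothetical kubeadm configuration to the given path.
# Every value below is an assumption, not the content of k8s/c2/kubeadm.yaml.
write_kubeadm_config() {
    # $1: output path
    cat > "$1" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  dnsDomain: c2.k8s.ooo                  # DNS domain of the cluster
  podSubnet: 2a0a:e5c0:13:e1::/64        # assumed pod network
  serviceSubnet: 2a0a:e5c0:13:e2::/108   # assumed service network
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: cgroupfs                   # assumed driver for Alpine
EOF
}
```

Usage could be `write_kubeadm_config /tmp/kubeadm.yaml` followed by `kubeadm init --config /tmp/kubeadm.yaml`.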

Adding worker nodes

kubeadm join [2a0a:e5c0:13:0:225:b3ff:fe20:38cc]:6443 --token cfrita.. \
        --discovery-token-ca-cert-hash sha256:...

Verifying that all nodes joined:

% kubectl get nodes
NAME       STATUS   ROLES                  AGE     VERSION
server47   Ready    control-plane,master   2m25s   v1.21.1
server48   Ready    <none>                 66s     v1.21.1
server49   Ready    <none>                 24s     v1.21.1
server50   Ready    <none>                 19s     v1.21.1
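
When scripting the setup, the same check can be automated. The following is a minimal sketch, assuming the default column layout of `kubectl get nodes` shown above; the function name `not_ready_nodes` is made up for this example.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin and exits non-zero if any
# such node is found. Note that a status such as
# "Ready,SchedulingDisabled" is also reported.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1; bad = 1 } END { exit bad }'
}
```

Usage: `kubectl get nodes | not_ready_nodes && echo "all nodes ready"`.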

Configuring networking

The customised calico.yaml enables IPv6:

kubectl apply -f cni-calico/calico.yaml

After applying, check that all calico pods are up and running:

% kubectl -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-b656ddcfc-5kfg6   0/1     Running   4          3m27s
calico-node-975vh                         1/1     Running   3          3m28s
calico-node-gbnvj                         1/1     Running   2          3m28s
calico-node-qjm5v                         0/1     Running   4          113s
calico-node-xxxmk                         1/1     Running   4          3m28s
coredns-558bd4d5db-56dv9                  1/1     Running   0          8m51s
coredns-558bd4d5db-hsspb                  1/1     Running   0          8m51s
etcd-server47                             1/1     Running   0          9m9s
kube-apiserver-server47                   1/1     Running   0          9m4s
kube-controller-manager-server47          1/1     Running   0          9m4s
kube-proxy-5g5qm                          1/1     Running   0          8m51s
kube-proxy-85mck                          1/1     Running   0          7m8s
kube-proxy-b95sv                          1/1     Running   0          7m13s
kube-proxy-mpjkm                          1/1     Running   0          7m55s
kube-scheduler-server47                   1/1     Running   0          9m10s
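
In the listing above, two calico pods still show 0/1 in the READY column. A small filter makes such pods easier to spot. This is a sketch that assumes the default `kubectl get pods` column layout; `unready_pods` is a hypothetical helper name.

```shell
# Print the names of pods whose READY column (ready/total) does not
# show all containers ready. Reads `kubectl get pods` output on stdin.
# Pods of finished jobs (0/1, Completed) would be listed as well.
unready_pods() {
    awk 'NR > 1 {
        split($2, r, "/")          # r[1] = ready, r[2] = total
        if (r[1] != r[2]) print $1
    }'
}
```

Usage: `kubectl -n kube-system get pods | unready_pods`. An empty result means every listed pod is fully ready.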

Some pods often crash in the beginning. If the mounts below are not shared, you might need to make them shared like this:

mount --make-shared /sys
mount --make-shared /run

(The above mounts are necessary on Alpine Linux.)
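
To find out whether a mount is already shared, the propagation flags in /proc/self/mountinfo can be inspected: a shared mount carries a `shared:N` tag among its optional fields. The following is a minimal sketch of such a check. The function name `is_shared_mount` and the optional second argument (used to read a different mountinfo file) are assumptions made for this example.

```shell
# is_shared_mount MOUNTPOINT [MOUNTINFO_FILE]
# Succeeds if the mount at MOUNTPOINT has shared propagation.
# If several mounts are stacked on the same mount point, the last
# entry in the file decides.
is_shared_mount() {
    awk -v mp="$1" '
        $5 == mp {
            found = 1; shared = 0
            # Optional fields start at column 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) shared = 1
        }
        END { exit !(found && shared) }
    ' "${2:-/proc/self/mountinfo}"
}
```

Usage: `is_shared_mount /sys || mount --make-shared /sys`.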

Getting calicoctl

To configure Calico we need calicoctl, which we can run in yet another pod as follows:

kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml

And we alias it for easier usage:

alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"

Adding BGP peering

We need to tell Calico which BGP peers to peer with. For this we use the bgp-c2.yaml file, which contains the configuration for our cluster:

calicoctl create -f - < cni-calico/bgp-c2.yaml
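
The content of bgp-c2.yaml is not shown here. For illustration only, a global Calico `BGPPeer` resource has roughly the shape generated by the sketch below. The helper name `bgp_peer_manifest` is hypothetical, and the peer name, address and AS number in the usage line are placeholders, not our real upstream values.

```shell
# Print a Calico BGPPeer manifest for one upstream peer.
# $1: resource name, $2: peer IPv6 address, $3: peer AS number
# A BGPPeer without a node selector applies to all nodes.
bgp_peer_manifest() {
    cat <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: $1
spec:
  peerIP: $2
  asNumber: $3
EOF
}
```

In an interactive shell with the alias from above, usage could look like `bgp_peer_manifest upstream1 2001:db8::1 64512 | calicoctl create -f -`.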

At this point all nodes should be peering with our upstream infrastructure. We can confirm this on the upstream side, where we also run bird:

% birdc show route
BIRD 2.0.7 ready.
Table master6:
2a0a:e5c0:13:e1:f4c5:ab65:a67f:53c0/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server3 20:04:14.224] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e2::/108 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.113] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e1:176b:eaa6:6d47:1c40/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13::/48    unreachable [v6 2021-05-16] * (200)
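
With four nodes peering, each cluster prefix should show four next hops. Counting them by eye gets tedious, so the following sketch summarises the output. It assumes the BIRD 2 output format shown above; `route_nexthops` is a hypothetical helper name.

```shell
# Summarise `birdc show route` output as "<prefix> <number of next hops>".
# Reads the birdc output on stdin.
route_nexthops() {
    awk '
        # A line whose first field looks like address/length opens a new route.
        $1 ~ /^[0-9a-fA-F:]+\/[0-9]+$/ {
            prefix = $1
            if (!(prefix in n)) { order[++k] = prefix; n[prefix] = 0 }
        }
        # Each "via <address> on <interface>" line is one next hop.
        $1 == "via" { n[prefix]++ }
        END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }
    '
}
```

Usage: `birdc show route | route_nexthops`. Routes without a next hop, such as the unreachable /48, are listed with a count of 0.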

Testing the cluster

At this point we should have a functioning k8s cluster. Now we should test whether it works using a simple nginx deployment.

Do NOT use https://k8s.io/examples/application/deployment.yaml. It contains an outdated nginx container that has no IPv6 listener. You will get results such as

% curl http://[2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41]
curl: (7) Failed to connect to 2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41 port 80: Connection refused

if you use that deployment. Instead, use something along the lines of the included nginx-test-deployment.yaml:

kubectl apply -f generic/nginx-test-deployment.yaml

Let's see whether the pods are coming up:

% kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-95d596f7b-484mz   1/1     Running   0          13s
nginx-deployment-95d596f7b-4wfkp   1/1     Running   0          13s

And the associated service:

% kubectl get svc
NAME            TYPE        CLUSTER-IP              EXTERNAL-IP   PORT(S)   AGE
kubernetes      ClusterIP   2a0a:e5c0:13:e2::1      <none>        443/TCP   16m
nginx-service   ClusterIP   2a0a:e5c0:13:e2::4412   <none>        80/TCP    34s

It is up and running, let's curl it!

% curl -I http://[2a0a:e5c0:13:e2::4412]
HTTP/1.1 200 OK
Server: nginx/1.20.0
Date: Mon, 14 Jun 2021 18:08:29 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 20 Apr 2021 16:11:05 GMT
Connection: keep-alive
ETag: "607efd19-264"
Accept-Ranges: bytes
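
Note the square brackets in the URL: IPv6 literals must be enclosed in brackets, otherwise the colons are ambiguous with the port separator. When scripting such tests, a tiny helper can add the brackets. This is a sketch, assuming the argument is a bare hostname or address without a port; `http_url` is a made-up name.

```shell
# Build an http URL from a hostname or IP address.
# Anything containing a colon is treated as an IPv6 literal and
# wrapped in brackets.
http_url() {
    case "$1" in
        *:*) printf 'http://[%s]/\n' "$1" ;;
        *)   printf 'http://%s/\n' "$1" ;;
    esac
}
```

Usage: `curl -I "$(http_url "$(kubectl get svc nginx-service -o jsonpath='{.spec.clusterIP}')")"`.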

Perfect. Let's delete it again:

kubectl delete -f generic/nginx-test-deployment.yaml

Next steps

While the above is already a fully running k8s cluster, we also want support for PersistentVolumeClaims. See the Rook documentation for how to achieve this next step.

The IPv4 "problem"

  • Clusters are IPv6 only
  • Need to have one or more services to map IPv4
  • Maybe outside haproxy w/ generic ssl/sni/host mapping
    • Could even be inside haproxy service