## IPv6 only kubernetes clusters

This project is testing, deploying and using IPv6 only k8s clusters.

## Docs

* [Setting up the cluster with calico](v3-calico/README.md)
* [Bootstrapping Rook](rook/README.md)

## Working

* networking (calico)
* ceph with rook (cephfs, rbd)

## Not (yet) working or tested

* virtualisation (VMs, kubevirt)
* letsencrypt
* network policies
* prometheus in the cluster
* argocd (?) for CI and upgrades
* maybe LoadBalancer support (our ClusterIP already does that, though)
* (other) DNS entries for services

## Cluster setup

* Calico CNI with BGP peering to our upstream infrastructure
* Rook for RBD and CephFS support

The following steps are a full walkthrough of setting up the IPv6 only
kubernetes cluster "c2.k8s.ooo".

### Initialise the master with kubeadm

We are using a custom kubeadm configuration to

* configure the cgroup driver
* configure the IP addresses
* configure the DNS domain (c2.k8s.ooo)

```
kubeadm init --config k8s/c2/kubeadm.yaml
```

### Adding worker nodes

```
kubeadm join [2a0a:e5c0:13:0:225:b3ff:fe20:38cc]:6443 --token cfrita.. \
    --discovery-token-ca-cert-hash sha256:...
```

Verify that all nodes have joined:

```
% kubectl get nodes
NAME       STATUS   ROLES                  AGE     VERSION
server47   Ready    control-plane,master   2m25s   v1.21.1
server48   Ready    <none>                 66s     v1.21.1
server49   Ready    <none>                 24s     v1.21.1
server50   Ready    <none>                 19s     v1.21.1
```

### Configuring networking

* The customised calico.yaml enables IPv6

```
kubectl apply -f cni-calico/calico.yaml
```

After applying, check that all calico pods are up and running:

```
% kubectl -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-b656ddcfc-5kfg6   0/1     Running   4          3m27s
calico-node-975vh                         1/1     Running   3          3m28s
calico-node-gbnvj                         1/1     Running   2          3m28s
calico-node-qjm5v                         0/1     Running   4          113s
calico-node-xxxmk                         1/1     Running   4          3m28s
coredns-558bd4d5db-56dv9                  1/1     Running   0          8m51s
coredns-558bd4d5db-hsspb                  1/1     Running   0          8m51s
etcd-server47                             1/1     Running   0          9m9s
kube-apiserver-server47                   1/1     Running   0          9m4s
kube-controller-manager-server47          1/1     Running   0          9m4s
kube-proxy-5g5qm                          1/1     Running   0          8m51s
kube-proxy-85mck                          1/1     Running   0          7m8s
kube-proxy-b95sv                          1/1     Running   0          7m13s
kube-proxy-mpjkm                          1/1     Running   0          7m55s
kube-scheduler-server47                   1/1     Running   0          9m10s
```

Often some pods crash in the beginning, and you might need to make the
following mounts shared (if they are not already):

```
mount --make-shared /sys
mount --make-shared /run
```

(The above mounts are necessary on Alpine Linux.)

### Getting calicoctl

To configure calico we need calicoctl, which we can run in
yet-another-pod as follows:

```
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
```

And we alias it for easier usage:

```
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
```

### Adding BGP peering

We need to tell calico which BGP peers to peer with. For this we use the
bgp-c2.yaml file, which contains the configuration matching our cluster
(a sketch of such a configuration is shown at the end of this section):

```
calicoctl create -f - < cni-calico/bgp-c2.yaml
```

At this point all nodes should be peering with our upstream
infrastructure. We can confirm this on the upstream side, where we also
run bird:

```
% birdc show route
BIRD 2.0.7 ready.
Table master6:
2a0a:e5c0:13:e1:f4c5:ab65:a67f:53c0/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server3 20:04:14.224] (100) [AS65534i]
        via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e2::/108 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.113] (100) [AS65534i]
        via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e1:176b:eaa6:6d47:1c40/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server2 20:04:14.222] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 20:04:14.221] (100) [AS65534i]
        via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
                     unicast [place7-server4 20:04:14.221] (100) [AS65534i]
        via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13::/48    unreachable [v6 2021-05-16] * (200)
```
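
The actual bgp-c2.yaml lives in this repository and is not reproduced here.
As a rough orientation only, a global Calico BGP setup for a cluster like
this might look roughly as follows: the AS number 65534 and the service
CIDR 2a0a:e5c0:13:e2::/108 match the bird output above, while the peer
address, the peer name and the upstream AS number are placeholders, not
our real values.

```
# Sketch only -- see cni-calico/bgp-c2.yaml for the real configuration.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  asNumber: 65534                    # cluster AS, as seen in the bird output above
  nodeToNodeMeshEnabled: true
  serviceClusterIPs:
    - cidr: 2a0a:e5c0:13:e2::/108    # advertise the IPv6 service (ClusterIP) CIDR via BGP
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router-1            # placeholder name
spec:
  peerIP: 2a0a:e5c0:13::1            # placeholder: address of the upstream BGP router
  asNumber: 65533                    # placeholder: AS number of the upstream router
```

Advertising the service CIDR is what makes the ClusterIPs directly
reachable from our upstream network, as seen later when curling the
nginx service from outside the cluster.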

### Testing the cluster

At this point we should have a functioning k8s cluster, so let's test
whether it works using a simple nginx deployment.

Do *NOT* use https://k8s.io/examples/application/deployment.yaml. It
contains an outdated nginx container that has no IPv6 listener. If you
use that deployment, you will get results such as:

```
% curl http://[2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41]
curl: (7) Failed to connect to 2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41 port 80: Connection refused
```

Instead use something along the lines of the included
**nginx-test-deployment.yaml**:

```
kubectl apply -f generic/nginx-test-deployment.yaml
```

Let's see whether the pods are coming up:

```
% kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-95d596f7b-484mz   1/1     Running   0          13s
nginx-deployment-95d596f7b-4wfkp   1/1     Running   0          13s
```

And the associated service:

```
% kubectl get svc
NAME            TYPE        CLUSTER-IP              EXTERNAL-IP   PORT(S)   AGE
kubernetes      ClusterIP   2a0a:e5c0:13:e2::1      <none>        443/TCP   16m
nginx-service   ClusterIP   2a0a:e5c0:13:e2::4412   <none>        80/TCP    34s
```

It is up and running, let's curl it!

```
% curl -I http://[2a0a:e5c0:13:e2::4412]
HTTP/1.1 200 OK
Server: nginx/1.20.0
Date: Mon, 14 Jun 2021 18:08:29 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 20 Apr 2021 16:11:05 GMT
Connection: keep-alive
ETag: "607efd19-264"
Accept-Ranges: bytes
```

Perfect. Let's delete it again:

```
kubectl delete -f generic/nginx-test-deployment.yaml
```

### Next steps

While the above is already a fully running k8s cluster, we also want
support for **PersistentVolumeClaims**. See
[the rook documentation](rook/README.md) on how to achieve the next step.

## The IPv4 "problem"

* Clusters are IPv6 only
* Need to have one or more services to map IPv4
* Maybe an outside haproxy with generic ssl/sni/host mapping (see the sketch below)
* Could even be **inside** the cluster as a haproxy service
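
As a hedged sketch of the "outside haproxy" idea (nothing here is part of
the repository yet): a dual-stacked host in front of the cluster could
terminate IPv4 and forward by Host header to the IPv6-only services,
which are reachable because the ClusterIP range is advertised via BGP.
The IPv4 address and the hostname below are placeholders; the backend
address is the example nginx-service ClusterIP from the walkthrough
above.

```
# haproxy.cfg sketch on a dual-stacked host -- placeholder addresses/hostnames
frontend http_ipv4
    bind 192.0.2.10:80                          # placeholder: public IPv4 address
    mode http
    # map incoming IPv4 requests by Host header to IPv6-only cluster services
    use_backend nginx_c2 if { hdr(host) -i nginx.c2.k8s.ooo }

backend nginx_c2
    mode http
    # placeholder: ClusterIP of the service inside the IPv6-only cluster
    server nginx [2a0a:e5c0:13:e2::4412]:80 check
```

For TLS the same pattern could use SNI-based routing in tcp mode instead
of Host-header matching, so certificates stay inside the cluster.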