IPv6-only Kubernetes clusters
This project tests, deploys, and uses IPv6-only k8s clusters.
Docs
Working
- networking (calico)
- ceph with rook (cephfs, rbd)
- letsencrypt (nginx, certbot, homemade)
- k8s test on arm64
- CI/CD using flux
- Chart repository (chartmuseum)
- Git repository (gitea)
Not (yet) working or tested
- proxy for pulling images only
  - configure a proxy on crio
  - set up a proxy in the cluster (?)
- virtualisation (VMs, kubevirt)
- network policies
- Prometheus for the cluster
- Maybe LoadBalancer support (our ClusterIP already does that though)
- (Other) DNS entries for services
- Internal backup / snapshots
- External backup (rsync, rbd mirror, etc.)
Cluster setup
- Calico CNI with BGP peering to our upstream infrastructure
- Rook for RBD and CephFS support
The following steps are a full walkthrough of setting up the IPv6-only Kubernetes cluster "c2.k8s.ooo".
Initialise the master with kubeadm
We are using a custom kubeadm configuration (k8s/c2/kubeadm.yaml) to
- configure the cgroup driver (for Alpine)
- configure the IP addresses
- configure the DNS domain (c2.k8s.ooo)
kubeadm init --config k8s/c2/kubeadm.yaml
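For illustration, a configuration covering the three points above might look like the following sketch. The real k8s/c2/kubeadm.yaml is not reproduced here: the API versions, the cgroupfs driver, the pod subnet, the advertise address and the scratch file name are all assumptions. Only the DNS domain and the service range (visible in the service listing further down) come from this document.

```sh
# Sketch only: writes an example kubeadm configuration to a scratch file.
# All values are illustrative assumptions, not the contents of k8s/c2/kubeadm.yaml.
kubeadm_example="${TMPDIR:-/tmp}/kubeadm-example.yaml"
cat > "$kubeadm_example" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  # Assumed: the API endpoint address used in the kubeadm join command
  advertiseAddress: "2a0a:e5c0:13:0:225:b3ff:fe20:38cc"
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  dnsDomain: c2.k8s.ooo
  podSubnet: "2a0a:e5c0:13:e1::/64"       # assumed pod range
  serviceSubnet: "2a0a:e5c0:13:e2::/108"  # matches the ClusterIP range in this document
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: cgroupfs                    # assumed: Alpine uses OpenRC, not systemd
EOF
```

Keeping all three documents in one file means a single `--config` argument is enough for `kubeadm init`.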
Adding worker nodes
kubeadm join [2a0a:e5c0:13:0:225:b3ff:fe20:38cc]:6443 --token cfrita.. \
--discovery-token-ca-cert-hash sha256:...
Verifying that all nodes joined:
% kubectl get nodes
NAME STATUS ROLES AGE VERSION
server47 Ready control-plane,master 2m25s v1.21.1
server48 Ready <none> 66s v1.21.1
server49 Ready <none> 24s v1.21.1
server50 Ready <none> 19s v1.21.1
Configuring networking
- This customised calico.yaml enables IPv6
kubectl apply -f cni-calico/calico.yaml
After applying, check that all calico pods are up and running:
% kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-b656ddcfc-5kfg6 0/1 Running 4 3m27s
calico-node-975vh 1/1 Running 3 3m28s
calico-node-gbnvj 1/1 Running 2 3m28s
calico-node-qjm5v 0/1 Running 4 113s
calico-node-xxxmk 1/1 Running 4 3m28s
coredns-558bd4d5db-56dv9 1/1 Running 0 8m51s
coredns-558bd4d5db-hsspb 1/1 Running 0 8m51s
etcd-server47 1/1 Running 0 9m9s
kube-apiserver-server47 1/1 Running 0 9m4s
kube-controller-manager-server47 1/1 Running 0 9m4s
kube-proxy-5g5qm 1/1 Running 0 8m51s
kube-proxy-85mck 1/1 Running 0 7m8s
kube-proxy-b95sv 1/1 Running 0 7m13s
kube-proxy-mpjkm 1/1 Running 0 7m55s
kube-scheduler-server47 1/1 Running 0 9m10s
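Note that in the listing above two calico pods (calico-kube-controllers-b656ddcfc-5kfg6 and calico-node-qjm5v) still report 0/1 in the READY column. A small filter can make such pods easier to spot. This is a minimal sketch; `not_ready_pods` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pods`.

```sh
# Print the names of pods whose READY column is not n/n.
# Reads "kubectl get pods" output on stdin and skips the header line.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, r, "/")            # READY column, e.g. "0/1"
        if (r[1] != r[2]) print $1
    }'
}

# Usage against a live cluster:
#   kubectl -n kube-system get pods | not_ready_pods
```

An empty result means every listed pod has all of its containers ready.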
Often some pods crash at the beginning, and you might need to make the mounts shared (if they are not already), like this:
mount --make-shared /sys
mount --make-shared /run
(The above mounts are necessary on Alpine Linux.)
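To find out whether a mount is already shared, the propagation flags in /proc/self/mountinfo can be inspected. The following is a minimal sketch; `is_shared` is a hypothetical helper name, and it relies on the kernel's mountinfo layout, where the optional fields (such as `shared:N`) sit between column 7 and the `-` separator.

```sh
# Succeed if the given mount point has shared propagation.
# Optional second argument: a mountinfo file (defaults to /proc/self/mountinfo).
is_shared() {
    awk -v mp="$1" '
        $5 == mp {
            # Optional fields start at column 7 and end at the "-" separator
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) found = 1
        }
        END { exit !found }
    ' "${2:-/proc/self/mountinfo}"
}

# Usage: only change propagation when needed
#   is_shared /sys || mount --make-shared /sys
#   is_shared /run || mount --make-shared /run
```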
Getting calicoctl
To configure calico, we need calicoctl, which we can run in yet another pod as follows:
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
And we alias it for easier usage:
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
Adding BGP peering
We need to tell calico which BGP peers to peer with. For this we use the bgp-c2.yaml file, which contains the configuration for our cluster:
calicoctl create -f - < cni-calico/bgp-c2.yaml
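The contents of bgp-c2.yaml are not shown here. As a rough sketch, a Calico BGP setup of this kind usually consists of a BGPConfiguration and one BGPPeer per upstream router. In the example below the AS number 65534 and the service range are taken from this document; the peer name, peer IP (documentation prefix), upstream AS and the scratch file name are placeholders.

```sh
# Sketch only: writes an example Calico BGP configuration to a scratch file.
# This is not the content of cni-calico/bgp-c2.yaml.
bgp_example="${TMPDIR:-/tmp}/bgp-example.yaml"
cat > "$bgp_example" <<'EOF'
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  asNumber: 65534                     # AS used by the cluster nodes
  serviceClusterIPs:
    - cidr: "2a0a:e5c0:13:e2::/108"   # advertise the service range upstream
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router-1             # hypothetical name
spec:
  peerIP: "2001:db8::1"               # placeholder (documentation prefix)
  asNumber: 64512                     # placeholder upstream AS
EOF

# It could then be loaded the same way as the real file:
#   calicoctl create -f - < "$bgp_example"
```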
At this point all nodes should be peering with our upstream infrastructure. We can confirm this on the upstream side, where we also run bird:
% birdc show route
BIRD 2.0.7 ready.
Table master6:
2a0a:e5c0:13:e1:f4c5:ab65:a67f:53c0/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
unicast [place7-server3 20:04:14.224] (100) [AS65534i]
via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
unicast [place7-server2 20:04:14.222] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
unicast [place7-server4 20:04:14.221] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e2::/108 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
unicast [place7-server2 20:04:14.222] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
unicast [place7-server3 20:04:14.113] (100) [AS65534i]
via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
unicast [place7-server4 20:04:14.221] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13:e1:176b:eaa6:6d47:1c40/122 unicast [place7-server1 20:04:14.222] * (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
unicast [place7-server2 20:04:14.222] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
unicast [place7-server3 20:04:14.221] (100) [AS65534i]
via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
unicast [place7-server4 20:04:14.221] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
2a0a:e5c0:13::/48 unreachable [v6 2021-05-16] * (200)
Testing the cluster
At this point we should have a functioning k8s cluster. Next, we test whether it works using a simple nginx deployment.
Do NOT use https://k8s.io/examples/application/deployment.yaml. It contains an outdated nginx container that has no IPv6 listener. You will get results such as
% curl http://[2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41]
curl: (7) Failed to connect to 2a0a:e5c0:13:bbb:176b:eaa6:6d47:1c41 port 80: Connection refused
if you use that deployment. Instead, use something along the lines of the included nginx-test-deployment.yaml:
kubectl apply -f generic/nginx-test-deployment.yaml
Let's see whether the pods are coming up:
% kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-95d596f7b-484mz 1/1 Running 0 13s
nginx-deployment-95d596f7b-4wfkp 1/1 Running 0 13s
And the associated service:
% kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 2a0a:e5c0:13:e2::1 <none> 443/TCP 16m
nginx-service ClusterIP 2a0a:e5c0:13:e2::4412 <none> 80/TCP 34s
It is up and running, let's curl it!
% curl -I http://[2a0a:e5c0:13:e2::4412]
HTTP/1.1 200 OK
Server: nginx/1.20.0
Date: Mon, 14 Jun 2021 18:08:29 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 20 Apr 2021 16:11:05 GMT
Connection: keep-alive
ETag: "607efd19-264"
Accept-Ranges: bytes
Perfect. Let's delete it again:
kubectl delete -f generic/nginx-test-deployment.yaml
Next steps
While the above is already a fully running k8s cluster, we also want support for PersistentVolumeClaims. See the rook documentation on how to achieve the next step.
Highly available control plane
The above steps result in a single control plane node. For production setups, however, the control plane should consist of three nodes.
The guide for creating HA clusters refers to an external load balancer that
Secrets
Generating them inside the cluster
Handled via https://github.com/mittwald/kubernetes-secret-generator
helm repo add mittwald https://helm.mittwald.de
helm repo update
helm upgrade --install kubernetes-secret-generator mittwald/kubernetes-secret-generator
Generating / creating secrets:
apiVersion: v1
kind: Secret
metadata:
  name: string-secret
  annotations:
    secret-generator.v1.mittwald.de/autogenerate: password
data:
  username: c29tZXVzZXI=
- Advantage: passwords are only in the cluster
- Disadvantage: passwords are only in the cluster
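Values under `data:` in a Secret are base64-encoded, not encrypted, so they can be read back with standard tools. The sketch below decodes the username from the manifest above; `decode_b64` and `show_generated_password` are hypothetical helper names, and the second one assumes that the generator stores its value under the key `password`, as named in the annotation.

```sh
# Decode a base64 value as found under "data:" in a Secret manifest.
decode_b64() {
    printf '%s' "$1" | base64 -d
}

decode_b64 'c29tZXVzZXI='   # prints: someuser

# Hypothetical helper: read the generated password from a live cluster.
# Assumes the key "password", as listed in the autogenerate annotation.
show_generated_password() {
    kubectl get secret string-secret -o jsonpath='{.data.password}' | base64 -d
}
```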
CI/CD
What we want
- Package everything into one git repository (charts, kustomize, etc.)
- Be usable for multiple clusters
- Easily apply cross cluster
What we don't want / what is problematic
- Uploading charts to something like chartmuseum
  - Is redundant: we already have a version in git
  - Is manual (could probably be automated)
ArgoCD
Looks too big, too complex, too complicated.
FluxCD2
Looks ok, and its handling of helm is ok, but it does not feel intuitive. It seems to be more oriented towards "kustomizing helm charts".
Helmfile
helmfile seems to do most of what we need.
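As a rough illustration of what this could look like, the secret generator release from the Secrets section can be expressed as a helmfile. This is a sketch under assumptions: the namespace and the scratch file name are made up, and nothing here is taken from an existing helmfile in this repository.

```sh
# Sketch only: writes an example helmfile to a scratch file.
helmfile_example="${TMPDIR:-/tmp}/helmfile-example.yaml"
cat > "$helmfile_example" <<'EOF'
repositories:
  - name: mittwald
    url: https://helm.mittwald.de

releases:
  - name: kubernetes-secret-generator
    namespace: default                # assumed namespace
    chart: mittwald/kubernetes-secret-generator
EOF

# With helmfile installed, the release could then be applied with:
#   helmfile -f "$helmfile_example" sync
```

Because the file lives in git next to the charts, the same declaration can be reused for several clusters.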
The IPv4 "problem"
- Clusters are IPv6 only
- Need to have one or more services to map IPv4
  - Maybe an outside haproxy with generic ssl/sni/host mapping
  - Could even be an inside haproxy service
Flux + Chartmuseum
- For automatic deployments, we can use flux
- To be able to use flux with our charts, we need a Chartmuseum
- To access a private chartmuseum, we need a shared secret
- Thus we probably do need sops or similar
-alternative-
- Using kustomize, local resources can be used