++k8s/network planning

Nico Schottelius 2021-06-26 14:37:31 +02:00
parent 0cb935b5aa
commit 6fdf4abb6f

@ -0,0 +1,122 @@
title: Kubernetes Network planning with IPv6
---
pub_date: 2021-06-26
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
Learn which networks are sensible to use with Kubernetes.
---
body:
## Introduction
While IPv6 has a huge address space, you still need to specify a
**podCidr** (the network for the pods) and a **serviceCidr** (the
network for the services) for Kubernetes. In this blog article we show
our findings and recommend the "most sensible" networks to use for
Kubernetes.
## TL;DR
## Kubernetes limitations
In a typical IPv6 network, you would "just assign a /64" to anything
that needs to be a network. It is the no-brainer way of handling
networking in IPv6.
However, Kubernetes has a limitation:
[the serviceCidr cannot be bigger than a /108 at the
moment](https://github.com/kubernetes/kubernetes/pull/90115), that is,
its prefix length must be at least 108.
This is very atypical for the IPv6 world, but nothing we cannot
handle. There are various pull requests and issues on GitHub to fix
this behaviour, some of them listed below:
* https://github.com/kubernetes/enhancements/pull/1534
* https://github.com/kubernetes/kubernetes/pull/79993
* https://github.com/kubernetes/kubernetes/pull/90115 (this one is
quite interesting to read)
That said, it is possible to use a /64 for the **podCidr**.
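The /108 rule can be checked before a cluster is created. The
following is a minimal sketch using Python's standard `ipaddress`
module. The helper name `service_cidr_ok` and the constant
`MIN_SERVICE_PREFIXLEN` are made up for illustration and are not part
of kubernetes; the value 108 is taken from the limitation described
above.

```python
import ipaddress

# Assumption: 108 is the shortest prefix length accepted for the
# serviceCidr, as described in the text above.
MIN_SERVICE_PREFIXLEN = 108


def service_cidr_ok(cidr: str) -> bool:
    """Return True if the IPv6 network is no bigger than a /108."""
    network = ipaddress.IPv6Network(cidr)
    return network.prefixlen >= MIN_SERVICE_PREFIXLEN


if __name__ == "__main__":
    for candidate in ("2001:db8:1:2::/64", "2001:db8:1:2::/108"):
        print(candidate, service_cidr_ok(candidate))
```

With the two example networks, only the /108 passes the check.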
## The "correct way" without the /108 limitation
If Kubernetes did not have this limitation, our recommendation would
be to use one /64 for the podCidr and one /64 for the serviceCidr. If
this limitation is lifted in the future, skip the rest of this article
and just use two /64's.
Do not be tempted to suggest making /108's the default, even if they
"have enough space", because using /64's keeps your network plans
much simpler.
## Sanity checking the /108
To be able to plan Kubernetes clusters, it is important to know where
they should live, especially if you plan to run a lot of Kubernetes
clusters. Let's have a short look at the /108 network limitation:
A /108 leaves 20 bits for generating addresses, for a total of
1048576 addresses. This is probably enough for the number of services
in a cluster. Now, can we be consistent and also use a /108 for the
podCidr? Let's assume for the moment that we do exactly that, so we
run a maximum of 1048576 pods at the same time. Assuming each service
consumes on average 4 pods, this would allow one to run 262144
services.
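The numbers above can be reproduced with a few lines of Python. This
is only a sketch: the prefix 2001:db8:1:2::/108 is a documentation
example, and `PODS_PER_SERVICE` is simply the average assumed in the
text.

```python
import ipaddress

# Documentation prefix, used here only as an example /108.
network = ipaddress.IPv6Network("2001:db8:1:2::/108")

# Bits left for addresses: 128 - 108.
host_bits = network.max_prefixlen - network.prefixlen

# Total number of addresses in the network: 2 ** host_bits.
addresses = network.num_addresses

# Average from the text above: each service consumes 4 pods.
PODS_PER_SERVICE = 4
max_services = addresses // PODS_PER_SERVICE

print(host_bits, addresses, max_services)
```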
Assuming each pod uses around 0.1 CPUs and 100Mi RAM, if all pods were
to run at the same time, you would need ca. 100'000 CPUs and 100 TB
RAM. Assuming further that each node contains at maximum 128 CPUs and
at maximum 1 TB RAM (quite powerful servers), we would need roughly
820 servers just for the CPUs, while about 100 servers would already
cover the RAM.
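The sizing estimate can be written down as a short calculation. The
sketch below uses integer millicores and MiB to avoid rounding issues,
and it treats 1 TB as 1 TiB for simplicity; all constants are the
assumptions from the paragraph above, not measured values.

```python
# All figures are the assumptions from the text, not measurements.
PODS = 2 ** 20                  # every address of a /108 in use
CPU_PER_POD_MILLI = 100         # 0.1 CPU per pod, in millicores
RAM_PER_POD_MIB = 100           # 100Mi per pod
NODE_CPU_MILLI = 128 * 1000     # 128 CPUs per node, in millicores
NODE_RAM_MIB = 1024 * 1024      # 1 TB per node, treated as 1 TiB


def ceil_div(a: int, b: int) -> int:
    """Integer division, rounded up."""
    return -(-a // b)


total_cpu_milli = PODS * CPU_PER_POD_MILLI
total_ram_mib = PODS * RAM_PER_POD_MIB

nodes_for_cpu = ceil_div(total_cpu_milli, NODE_CPU_MILLI)
nodes_for_ram = ceil_div(total_ram_mib, NODE_RAM_MIB)

print("CPUs needed:", total_cpu_milli / 1000)
print("RAM needed (TiB):", total_ram_mib // (1024 * 1024))
print("nodes for CPU:", nodes_for_cpu)
print("nodes for RAM:", nodes_for_ram)
```

Under these assumptions the CPUs, not the RAM, determine how many
servers are needed.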
So we can conclude that **we can** run Kubernetes clusters of quite
some size even with a **podCidr of /108**.
## Organising /108's
Let's assume that we organise all our Kubernetes clusters in a single
/64, like 2001:db8:1:2::/64, which looks like this:
```
% sipcalc 2001:db8:1:2::/64
-[ipv6 : 2001:db8:1:2::/64] - 0
[IPV6 INFO]
Expanded Address - 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address - 2001:db8:1:2::
Subnet prefix (masked) - 2001:db8:1:2:0:0:0:0/64
Address ID (masked) - 0:0:0:0:0:0:0:0/64
Prefix address - ffff:ffff:ffff:ffff:0:0:0:0
Prefix length - 64
Address type - Aggregatable Global Unicast Addresses
Network range - 2001:0db8:0001:0002:0000:0000:0000:0000 -
2001:0db8:0001:0002:ffff:ffff:ffff:ffff
```
A /108 network on the other hand looks like this:
```
% sipcalc 2001:db8:1:2::/108
-[ipv6 : 2001:db8:1:2::/108] - 0
[IPV6 INFO]
Expanded Address - 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address - 2001:db8:1:2::
Subnet prefix (masked) - 2001:db8:1:2:0:0:0:0/108
Address ID (masked) - 0:0:0:0:0:0:0:0/108
Prefix address - ffff:ffff:ffff:ffff:ffff:ffff:fff0:0
Prefix length - 108
Address type - Aggregatable Global Unicast Addresses
Network range - 2001:0db8:0001:0002:0000:0000:0000:0000 -
2001:0db8:0001:0002:0000:0000:000f:ffff
```
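The two sipcalc outputs above can be cross-checked with Python's
standard `ipaddress` module, which can also count how many /108
networks fit into the /64. This is a rough sketch using the same
documentation prefix; the variable names are chosen for illustration
only.

```python
import ipaddress
import itertools

parent = ipaddress.IPv6Network("2001:db8:1:2::/64")

# Number of /108 networks inside the /64: 2 ** (108 - 64).
count_108 = 2 ** (108 - parent.prefixlen)

# subnets() is a generator, so only the first three /108s are
# materialised here instead of all 2 ** 44 of them.
first_three = list(itertools.islice(parent.subnets(new_prefix=108), 3))

# First and last address of the first /108, as in the sipcalc output.
first_108 = first_three[0]
range_start, range_end = first_108[0], first_108[-1]

print(count_108)
for net in first_three:
    print(net)
print(range_start, "-", range_end)
```

Consecutive /108 networks differ in the seventh group of the address,
which increases in steps of 0x10.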
Assuming for a moment that we assign a /108, this looks as follows: