Merge branch 'master' of code.ungleich.ch:ungleich-public/ungleich-staticcms

commit bc5fc19ca7

21 changed files with 2216 additions and 2 deletions
BIN  assets/u/image/k8s-v6-v4-dns.png  (new file; binary file not shown; size: 88 KiB)
162  content/u/blog/datacenterlight-active-active-routing/contents.lr  (new file)

@ -0,0 +1,162 @@
title: Active-Active Routing Paths in Data Center Light
---
pub_date: 2019-11-08
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:

---
body:

From our last two blog articles (a, b) you probably already know that
it is spring network cleanup in [Data Center Light](https://datacenterlight.ch).

In the [first blog article]() we described where we started, and in
the [second blog article]() you could see how we switched our
infrastructure to IPv6 only netboot.

In this article we will dive a bit more into the details of our
network architecture and the problems we face with active-active
routers.

## Network architecture

Let's have a look at a simplified (!) diagram of the network:

... IMAGE

Doesn't look that simple, does it? Let's break it down into small
pieces.

## Upstream routers

We have a set of **upstream routers** which work statelessly. They don't
have any stateful firewall rules, so both of them can work actively
without state synchronisation. Moreover, both of them peer with the
data center upstreams. These are fast routers and besides forwarding,
they also do **BGP peering** with our upstreams.

Overall, the upstream routers are very simple machines, mostly running
bird and forwarding packets all day. They also provide a DNS service
(resolving and authoritative), because they are always up and can
announce service IPs via BGP or via OSPF to our network.

## Internal routers

The internal routers, on the other hand, provide **stateful routing**,
**IP address assignments** and **netboot services**. They are a bit
more complicated compared to the upstream routers, but they carry only
a small routing table.

## Communication between the routers

All routers employ OSPF and BGP for route exchange. Thus the two
upstream routers learn about the internal networks (IPv6 only, as
usual) from the internal routers.

## Sessions

Sessions in networking are almost always an evil. You need to store
them (at high speed), you need to maintain them (updating, deleting)
and if you run multiple routers, you even need to synchronise them.
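To make the session handling concrete: on a Linux router, a stateful firewall of this kind typically boils down to a conntrack-based ruleset. A minimal nftables sketch — illustrative only, with a hypothetical interface name, not our actual configuration:

```
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Stateful part: only packets matching a stored session pass.
        ct state established,related accept
        ct state invalid drop

        # New sessions may only be initiated from the server network.
        iifname "servers0" ct state new accept
    }
}
```

Every packet accepted as `new` creates exactly the kind of per-session state discussed above: it has to be stored, updated, expired and, with more than one router, synchronised.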

In our case the internal routers do have session handling, as they
provide a stateful firewall. As we are using a multi-router setup,
things can go really wrong if the wrong routes are being used.

Let's have a look at this a bit more in detail.

## The good path

IMAGE2: good

If a server sends out a packet via router1 and router1 eventually
receives the answer, everything is fine. The returning packet matches
the state entry that was created by the outgoing packet and the
internal router forwards the packet.

## The bad path

IMAGE3: bad

However, if the answer comes back via router2 instead, there is no
matching state entry on that router: its stateful firewall drops the
returning packet.

## Routing paths

If we want to go active-active routing, the server can choose between
either internal router for sending out the packet. The internal
routers again have two upstream routers. So with the return path
included, the following paths exist for a packet:

Outgoing paths:

* servers->router1->upstream router1->internet
* servers->router1->upstream router2->internet
* servers->router2->upstream router1->internet
* servers->router2->upstream router2->internet

And the returning paths are:

* internet->upstream router1->router 1->servers
* internet->upstream router1->router 2->servers
* internet->upstream router2->router 1->servers
* internet->upstream router2->router 2->servers

So on average, 50% of the routes will hit the right router on
return. However, neither the servers nor the upstream routers use load
balancing like ECMP, so once an incorrect path has been chosen, the
packet loss is 100%.
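The 50% figure can be checked with a short enumeration: the internal router traversed on the way out and the one traversed on the way back are chosen independently, and only matching pairs find a state entry.

```python
from itertools import product

# Internal router traversed on the way out vs. on the way back;
# the two choices are made independently (by the server and by the
# upstream router respectively).
outgoing = ["router1", "router2"]
returning = ["router1", "router2"]

combos = list(product(outgoing, returning))
good = [pair for pair in combos if pair[0] == pair[1]]  # state entry exists

print(f"{len(good)}/{len(combos)} return paths find their session state")
# → 2/4 return paths find their session state
```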

## Session synchronisation

In the first article we talked a bit about keepalived and that
it helps to operate routers in an active-passive mode. This did not
turn out to be the most reliable method. Can we do better with
active-active routers and session synchronisation?

Linux supports this using
[conntrackd](http://conntrack-tools.netfilter.org/). However,
conntrackd supports active-active routers on a **flow based** level,
but not on a **packet based** level. The difference is that the
following will not work in active-active routers with conntrackd:

```
#1 Packet (in the original direction) updates state in Router R1 ->
   submit state to R2
#2 Packet (in the reply direction) arrive to Router R2 before state
   coming from R1 has been digested.

With strict stateful filtering, Packet #2 will be dropped and it will
trigger a retransmission.
```

(quote from Pablo Neira Ayuso, see below for more details)

Some of you will mumble something like **latency** in their heads right
now. If the return packet is guaranteed to arrive after state
synchronisation, then everything is fine. However, if the reply is
faster than the state synchronisation, packets will get dropped.

In reality, this will work for packets coming from and going to the
Internet. However, in our setup the upstream routers route between
different data center locations, which are in the sub-microsecond
latency area - i.e. LAN speed, because they are interconnected with
dark fiber links.
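For completeness, here is roughly what such state replication looks like in conntrackd. This is an illustrative sketch, not our production configuration — addresses and the interface name are placeholders, and the full option set is in the conntrack-tools documentation:

```
Sync {
    Mode FTFW {
        # Reliable replication: peers acknowledge state updates.
    }
    Multicast {
        IPv4_address 225.0.0.50
        Group 3780
        IPv4_interface 192.168.100.1
        Interface eth2
        Checksum on
    }
}
```

Even with this in place, replication stays asynchronous, which is exactly the flow-based limitation described in the quote above.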

## Take away

Before moving on to the next blog article, we would like to express
our thanks to Pablo Neira Ayuso, who gave very important input on
session based firewalls and session synchronisation.

So active-active routing seems not to have a straightforward
solution. Read in the [next blog
article](/u/blog/datacenterlight-redundant-routing-infrastructure)
how we solved the challenge in the end.
219  content/u/blog/datacenterlight-ipv6-only-netboot/contents.lr  (new file)

@ -0,0 +1,219 @@
title: IPv6 only netboot in Data Center Light
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
How we switched from IPv4 netboot to IPv6 netboot
---
body:

In our [previous blog
article](/u/blog/datacenterlight-spring-network-cleanup)
we wrote about our motivation for the
big spring network cleanup. In this blog article we show how we
started reducing the complexity by removing our dependency on IPv4.

## IPv6 first

If you have found our blog, you are probably aware: everything at
ungleich is IPv6 first. Many of our networks are IPv6 only, all DNS
entries for remote access have IPv6 (AAAA) entries and there are only
rare exceptions where we utilise IPv4 for our infrastructure.

## IPv4 only Netboot

One of the big exceptions to this paradigm used to be how we boot our
servers. Because our second big paradigm is sustainability, we use a
lot of 2nd (or 3rd) generation hardware. We actually share this
passion with our friends from
[e-durable](https://recycled.cloud/), because sustainability is
something that we need to employ today and not tomorrow.
But back to the netbooting topic: for netbooting we have mainly
relied on onboard network cards so far.

## Onboard network cards

We used these network cards for multiple reasons:

* they exist in virtually any server
* they usually have a ROM containing a PXE capable firmware
* they allow us to split traffic: real traffic on the fiber cards,
  internal traffic on the onboard cards

However, using the onboard devices also comes with a couple of disadvantages:

* Their ROM is often outdated
* They require additional cabling

## Cables

Let's have a look at the cabling situation first. Virtually all of
our servers are connected to the network using 2x 10 Gbit/s fiber cards.

On the one side this provides a fast connection, but on the other side
it provides us with something even better: distance.

Our data centers employ a non-standard design due to the re-use of
existing factory halls. This means distances between servers and
switches can be up to 100m. With fiber, we can easily achieve these
distances.

Additionally, having fewer cables provides a simpler infrastructure
that is easier to analyse.

## Disabling onboard network cards

So can we somehow get rid of the copper cables and switch to fiber
only? It turns out that the fiber cards we use (mainly Intel X520s)
have their own ROM. So we started disabling the onboard network cards
and tried booting from the fiber cards. This worked until we wanted to
move the lab setup to production...

## Bonding (LACP) and VLAN tagging

Our servers use bonding (802.3ad) for redundant connections to the
switches and VLAN tagging on top of the bonded devices to isolate
client traffic. On the switch side we realised this using
configurations like

```
interface Port-Channel33
   switchport mode trunk
   mlag 33

...
interface Ethernet33
   channel-group 33 mode active
```

But that does not work if the network ROM at boot does not create an
LACP enabled link on top of which it should be doing VLAN tagging.

The ROM in our network cards **would** have allowed VLAN tagging alone,
though.

To fix this problem, we reconfigured our switches as follows:

```
interface Port-Channel33
   switchport trunk native vlan 10
   switchport mode trunk
   port-channel lacp fallback static
   port-channel lacp fallback timeout 20
   mlag 33
```

This basically does two things:

* If there are no LACP frames, fall back to a static (non-LACP)
  configuration
* Accept untagged traffic and map it to VLAN 10 (one of our boot networks)

Great, our servers can now netboot from fiber! But we are not done
yet...

## IPv6 only netbooting

So how do we convince these network cards to do IPv6 netboot? Can we
actually do that at all? Our first approach was to put a custom build of
[ipxe](https://ipxe.org/) on a USB stick. We generated that
ipxe image using the **rebuild-ipxe.sh** script
from the
[ungleich-tools](https://code.ungleich.ch/ungleich-public/ungleich-tools)
repository. It turns out using a USB stick works pretty well for most
situations.

## ROMs are not ROMs

As you can imagine, the ROM of the X520 cards does not contain IPv6
netboot support. So are we back at square one? No, we are not. Because
the X520s have something that the onboard devices did not
consistently have: **a rewritable memory area**.

Let's take two steps back here first: a ROM is a **read only memory**
chip. Emphasis on **read only**. However, modern network cards and a
lot of devices that support on-device firmware do actually have a
memory (flash) area that can be written to. And that is what aids us
in our situation.

## ipxe + flbtool + x520 = fun

Trying to write ipxe into the X520 cards initially failed, because the
network card did not recognise the format of the ipxe rom file.

Luckily, the folks in the ipxe community had already spotted that problem
AND fixed it: the format used in these cards is called FLB. And there
is [flbtool](https://github.com/devicenull/flbtool/), which allows you
to wrap the ipxe rom file into the FLB format. For those who want to
try it yourself (at your own risk!), it basically involves:

* Get the current ROM from the card (try bootutil64e)
* Extract the contents from the ROM using flbtool
* This will output some sections/parts
* Locate the part that you want to overwrite with iPXE (a previous PXE
  section is very suitable)
* Replace the .bin file with your iPXE rom
* Adjust the .json file to match the length of the new binary
* Build a new .flb file using flbtool
* Flash it onto the card
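The "adjust the .json" step is easy to get wrong by hand. A small sketch of how it can be automated — the file names and the `length` metadata key are hypothetical here, since flbtool's exact output layout may differ:

```python
import json
import pathlib
import tempfile

workdir = pathlib.Path(tempfile.mkdtemp())

# Stand-ins for the files flbtool extracted from the card's flash image.
part_bin = workdir / "part03.bin"    # gets replaced by the iPXE build
part_meta = workdir / "part03.json"  # per-part metadata, including its length
part_meta.write_text(json.dumps({"name": "PXE", "length": 0}))

# Step 1: drop the freshly built iPXE ROM in place of the old PXE part.
part_bin.write_bytes(b"\x55\xaa" + b"\x00" * 510)  # placeholder for ipxe.rom

# Step 2: keep the metadata in sync with the new binary's size, so that
# rebuilding the .flb file produces a consistent image.
meta = json.loads(part_meta.read_text())
meta["length"] = part_bin.stat().st_size
part_meta.write_text(json.dumps(meta, indent=2))

print(meta["length"])  # size of the replacement part in bytes
```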

While this is a bit of work, it is worth it for us, because...:

## IPv6 only netboot over fiber

With the modified ROM, basically loading iPXE at start, we can now
boot our servers in IPv6 only networks. On our infrastructure side, we
added two **tiny** things:

We use ISC dhcp with the following configuration file:

```
option dhcp6.bootfile-url code 59 = string;

option dhcp6.bootfile-url "http://[2a0a:e5c0:0:6::46]/ipxescript";

subnet6 2a0a:e5c0:0:6::/64 {}
```

(that is the complete configuration!)

And we used radvd to announce that there is "other information"
available, indicating that clients can actually query the dhcpv6 server:

```
interface bond0.10
{
  AdvSendAdvert on;
  MinRtrAdvInterval 3;
  MaxRtrAdvInterval 5;
  AdvDefaultLifetime 600;

  # IPv6 netbooting
  AdvOtherConfigFlag on;

  prefix 2a0a:e5c0:0:6::/64      { };

  RDNSS 2a0a:e5c0:0:a::a 2a0a:e5c0:0:a::b  { AdvRDNSSLifetime 6000; };
  DNSSL place5.ungleich.ch {  AdvDNSSLLifetime 6000; } ;
};
```

## Take away

Being able to reduce cabling was one big advantage in the beginning.

Switching to IPv6 only netboot does not seem like a big simplification
at first, besides being able to remove IPv4 from the server
networks.

However, as you will see in
[the next blog post](/u/blog/datacenterlight-active-active-routing/),
switching to IPv6 only netbooting is actually a key element in
reducing complexity in our network.
222  content/u/blog/datacenterlight-redundant-routing-infrastructure/contents.lr  (new file)

@ -0,0 +1,222 @@
title: Redundant routing infrastructure at Data Center Light
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:

---
body:

In case you have missed the previous articles, you can
get [an introduction to the Data Center Light spring
cleanup](/u/blog/datacenterlight-spring-network-cleanup),
see [how we switched to IPv6 only netboot](/u/blog/datacenterlight-ipv6-only-netboot)
or read about [the active-active routing
problems](/u/blog/datacenterlight-active-active-routing/).

In this article we will show how we finally solved the routing issue,
conceptually as well as practically.

## Active-active or active-passive routing?

In the [previous blog article](/u/blog/datacenterlight-active-active-routing/)
we reasoned that active-active routing, even with session
synchronisation, does not have a straightforward solution in our
case. However, in the
[first blog article](/u/blog/datacenterlight-spring-network-cleanup)
we reasoned that active-passive routers with VRRP and keepalived are
not stable enough either.

So which path should we take? Or is there another solution?

## Active-Active-Passive Routing

Let us introduce Active-Active-Passive routing. Something that sounds
strange at first, but is going to make sense in the next few
minutes.

We do want multiple active routers, but we do not want to have to
deal with session synchronisation, which is not only tricky, but due
to its complexity can also be a source of error.

So what we are looking for is active-active routing without state
synchronisation. While this sounds like a contradiction, if we loosen
our requirements a little bit, we are able to support multiple active
routers without session synchronisation by using **routing
priorities**.

## Active-Active routing with routing priorities

Let's assume for a moment that all involved hosts (servers, clients,
routers, etc.) know about multiple routes for outgoing and incoming
traffic. Let's also assume for a moment that **we can prioritise**
those routes. Then we can create a deterministic routing path that
does not need session synchronisation.

## Steering outgoing traffic

Let's have a first look at the outgoing traffic. Can we announce
multiple routers in a network, but have the servers and clients
**prefer** one of the routers? The answer is yes!
If we check the manpage of
[radvd.conf(5)](https://linux.die.net/man/5/radvd.conf) we find a
setting named **AdvDefaultPreference**:

```
AdvDefaultPreference low|medium|high
```

Using this attribute, two routers can both actively announce
themselves, but clients in the network will prefer the one with the
higher preference setting.
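Sketched with radvd, this is all it takes — an illustrative fragment, with the interface and prefix borrowed from the earlier netboot example rather than taken from our production configuration:

```
# router1 -- the preferred default gateway
interface bond0.10
{
  AdvSendAdvert on;
  AdvDefaultPreference high;
  prefix 2a0a:e5c0:0:6::/64 { };
};
```

The second router announces the same prefix with `AdvDefaultPreference low`; clients install both default routes but use router1 while it is reachable.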
 | 
				
### Replacing radvd with bird

At this point a short side note: we have been using radvd for some
years in the Data Center Light. Recently, however, on our
[Alpine Linux based routers](https://alpinelinux.org/), radvd started
to crash from time to time:

```
[717424.727125] device eth1 left promiscuous mode
[1303962.899600] radvd[24196]: segfault at 63f42258 ip 00007f6bdd59353b sp 00007ffc63f421b8 error 4 in ld-musl-x86_64.so.1[7f6bdd558000+48000]
[1303962.899609] Code: 48 09 c8 4c 85 c8 75 0d 49 83 c4 08 eb d4 39 f0 74 0c 49 ff c4 41 0f b6 04 24 84 c0 75 f0 4c 89 e0 41 5c c3 31 c9 0f b6 04 0f <0f> b6 14 0e 38 d0 75 07 48 ff c1 84 c0 75 ed 29 d0 c3 41 54 49 89
...
[1458460.511006] device eth0 entered promiscuous mode
[1458460.511168] radvd[27905]: segfault at 4dfce818 ip 00007f94ec1fd53b sp 00007ffd4dfce778 error 4 in ld-musl-x86_64.so.1[7f94ec1c2000+48000]
[1458460.511177] Code: 48 09 c8 4c 85 c8 75 0d 49 83 c4 08 eb d4 39 f0 74 0c 49 ff c4 41 0f b6 04 24 84 c0 75 f0 4c 89 e0 41 5c c3 31 c9 0f b6 04 0f <0f> b6 14 0e 38 d0 75 07 48 ff c1 84 c0 75 ed 29 d0 c3 41 54 49 89
...
```

Unfortunately it seems that either the addresses timed out or that
radvd was able to send a message de-announcing itself prior to the
crash, causing all clients to withdraw their addresses. This is
especially problematic if you run a [ceph](https://ceph.io/) cluster
and the servers suddenly don't have IP addresses anymore...

While we have not yet investigated the full cause of this, we had a
very easy solution: as all of our routers run
[bird](https://bird.network.cz/), which also supports sending router
advertisements, we replaced radvd with bird. The configuration is
actually pretty simple:

```
protocol radv {
    # Internal
    interface "eth1.5" {
         max ra interval 5;      # Fast failover with more routers
         other config yes;       # dhcpv6 boot
         default preference high;
    };
    rdnss {
        lifetime 3600;
        ns 2a0a:e5c0:0:a::a;
        ns 2a0a:e5c0:0:a::b;
    };
    dnssl {
        lifetime 3600;
        domain "place5.ungleich.ch";
    };
}
```

## Steering incoming traffic

As the internal and the upstream routers are in the same data center,
we can use an IGP like OSPF to distribute the routes to the internal
routers. And OSPF actually has a very neat metric called **cost**.
So for the router that sets the **default preference high** for the
outgoing routes, we keep the cost at 10; for the router that
sets the **default preference low** we set the cost to 20. The actual
bird configuration on a router looks like this:

```
define ospf_cost = 10;
...

protocol ospf v3 ospf6 {
        instance id 0;

        ipv6 {
                import all;
                export none;
        };

        area 0 {
                interface "eth1.*" {
                  authentication cryptographic;
                  password "weshouldhaveremovedthisfortheblogpost";
                  cost ospf_cost;
                };
        };
}
```
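The effect of the cost metric can be illustrated with a toy
shortest-path computation over a hypothetical topology (node names and
link costs are made up for illustration, not our actual network): the
upstream router prefers whichever internal router advertises the lower
interface cost.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra's algorithm: returns (total_cost, path) for the
    cheapest route from src to dst, like an OSPF SPF run."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, weight in graph.get(node, []):
            if neighbor not in seen:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None

# Hypothetical topology: both internal routers reach the servers,
# but router1 advertises cost 10 and router2 cost 20.
graph = {
    "upstream": [("router1", 10), ("router2", 20)],
    "router1": [("servers", 1)],
    "router2": [("servers", 1)],
}

cost, path = shortest_path(graph, "upstream", "servers")
print(cost, path)  # traffic flows via router1 as long as it is up
```

If the link to router1 disappears from the graph, the same computation
yields the path via router2, which is exactly the failover behaviour we
want for incoming traffic.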

## Incoming + Outgoing = symmetric paths

With both directions under our control, we have now enabled symmetric
routing in both directions. Thus as long as the first router is alive,
all traffic will be handled by the first router.

## Failover scenario

In case the first router fails, clients have a short lifetime of 15
seconds (3x **max ra interval**)
for their routes and they will fail over to the 2nd router
automatically. Existing sessions will not continue to work, but that
is ok for our setup. When the first router with the higher priority
comes back, there will again be an interruption, but clients will
automatically change their paths.
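The 15 seconds follow directly from the RA parameters above: the router
lifetime advertised to clients is three times **max ra interval**, so
the worst-case window before clients drop the dead router's routes is
(a rough sketch; it assumes no other RA arrives in between):

```python
# Worst-case failover window for the radv config shown above:
# clients keep a default route for the advertised router lifetime,
# which is 3 x "max ra interval" (per the article's statement).
max_ra_interval = 5                      # seconds, as configured
router_lifetime = 3 * max_ra_interval    # seconds until routes expire

print(router_lifetime)  # -> 15
```

Lowering **max ra interval** would shrink this window further, at the
cost of more multicast RA traffic on the segment.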

And so will the upstream routers, as OSPF is a quick protocol that
updates alive routers and routes.

## IPv6 enables active-active-passive routing architectures

At ungleich it almost always comes back to the topic of IPv6, albeit
for a good reason. You might remember that we claimed in the
[IPv6 only netboot](/u/blog/datacenterlight-ipv6-only-netboot) article
that this reduces complexity? If you look at the above example,
you might not spot it directly, but going IPv6 only is actually an
enabler for our setup:

We **only deploy router advertisements** using bird. We are **not using DHCPv4**
or **IPv4** for accessing our servers. Both routers run a dhcpv6
service in parallel, with the "boot server" pointing to themselves.

Besides being nice and clean,
our whole active-active-passive routing setup **would not work with
IPv4**, because dhcpv4 servers do not have the same functionality to
provide routing priorities.

## Take away

You can see that trying to solve one problem ("unreliable redundant
router setup") entailed a slew of changes, but in the end made our
infrastructure much simpler:

* No dual stack
* No private IPv4 addresses
* No actively communicating keepalived
* Two fewer daemons to maintain (keepalived, radvd)

We also avoided complex state synchronisation and deployed only Open
Source Software to address our problems. Furthermore, hardware that
looked unusable in modern IPv6 networks can be upgraded with Open
Source Software (iPXE), enabling us to provide more sustainable
infrastructures.

We hope you enjoyed our spring cleanup blog series. The next one will
be coming, because IT infrastructures always evolve. Until then:
feel free to [join our Open Source Chat](https://chat.with.ungleich.ch)
and join the discussion.

@@ -0,0 +1,161 @@
title: Data Center Light: Spring network cleanup
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
From today on ungleich offers free, encrypted IPv6 VPNs for hackerspaces
---
body:

## Introduction

Spring is the time for cleanup. Cleaning up your apartment, removing
dust from the cabinet, letting the light shine through the windows,
or like in our case: improving the networking situation.

In this article we give an introduction of where we started and what
the typical setup used to be in our data center.

## Best practice

When we started [Data Center Light](https://datacenterlight.ch) in
2017, we oriented ourselves towards "best practice" for networking. We
started with IPv6 only networks and used an RFC1918 network (10/8) for
internal IPv4 routing.

And we started with 2 routers for every network to provide
redundancy.

## Router redundancy

So what do you do when you have two routers? In the Linux world the
software [keepalived](https://keepalived.org/)
is very popular to provide redundant routing
using the [VRRP protocol](https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol).

## Active-Passive

While VRRP is designed to allow multiple (not only two) routers to
co-exist in a network, its design is basically active-passive: you
have one active router and n passive routers, in our case 1
additional.

## Keepalived: a closer look

A typical keepalived configuration in our network looked like this:

```
vrrp_instance router_v4 {
    interface INTERFACE
    virtual_router_id 2
    priority PRIORITY
    advert_int 1
    virtual_ipaddress {
        10.0.0.1/22 dev  eth1.5      # Internal
    }
    notify_backup "/usr/local/bin/vrrp_notify_backup.sh"
    notify_fault "/usr/local/bin/vrrp_notify_fault.sh"
    notify_master "/usr/local/bin/vrrp_notify_master.sh"
}

vrrp_instance router_v6 {
    interface INTERFACE
    virtual_router_id 1
    priority PRIORITY
    advert_int 1
    virtual_ipaddress {
        2a0a:e5c0:1:8::48/128  dev eth1.8 # Transfer for routing from outside
        2a0a:e5c0:0:44::7/64  dev bond0.18 # zhaw
        2a0a:e5c0:2:15::7/64 dev bond0.20 #
    }
}
```

This is a template that we distribute via [cdist](https://cdi.st). The
strings INTERFACE and PRIORITY are replaced via cdist. The interface
field defines which interface to use for VRRP communication and the
priority field determines which of the routers is the active one.
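The election itself is simple: the router with the highest priority
becomes master. A minimal model of this (names, priorities and the
IP-address tie-break are illustrative; real VRRP per RFC 5798 also
covers preemption and advertisement timing):

```python
def elect_master(routers):
    """Pick the VRRP master: highest priority wins; ties are broken
    by the highest primary IP address (simplified election model)."""
    return max(routers, key=lambda r: (r["priority"], r["ip"]))

routers = [
    {"name": "router1", "priority": 150, "ip": "10.0.0.2"},
    {"name": "router2", "priority": 100, "ip": "10.0.0.3"},
]

print(elect_master(routers)["name"])  # -> router1
```

If router1 disappears from the list, the same election immediately
selects router2, which is the failover behaviour keepalived gives us.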

So far, so good. However, let's have a look at a tiny detail of this
configuration file:

```
    notify_backup "/usr/local/bin/vrrp_notify_backup.sh"
    notify_fault "/usr/local/bin/vrrp_notify_fault.sh"
    notify_master "/usr/local/bin/vrrp_notify_master.sh"
```

These three lines basically say: "start something if you are the
master" and "stop something in case you are not". And why did we do
this? Because of stateful services.

## Stateful services

A typical shell script that we would call contains lines like this:

```
/etc/init.d/radvd stop
/etc/init.d/dhcpd stop
```

(or start in the case of the master version)

In earlier days, this even contained openvpn, which was running on our
first generation router version. But more about OpenVPN later.

The reason why we stopped and started dhcp and radvd is to make
clients of the network use the active router. We used radvd to provide
IPv6 addresses as the primary access method to servers. And we used
dhcp mainly to allow servers to netboot. The active router would
carry state (firewall!) and thus the flow of packets always needs to go
through the active router.

Restarting radvd on a different machine keeps the IPv6 addresses the
same, as clients assign them themselves using EUI-64. In case of dhcp
(IPv4) we could have used hardcoded IPv4 addresses using a mapping of
MAC address to IPv4 address, but we opted against this. The main
reason is that dhcp clients re-request their previous lease and even
if an IPv4 address changes, it is not really of importance.
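Why the addresses survive a radvd restart on another machine: with
SLAAC the client derives its interface identifier from its own MAC
address via modified EUI-64 (insert `ff:fe` in the middle, flip the
universal/local bit), so the same advertised prefix always yields the
same address regardless of which router sent the RA. A sketch with a
made-up MAC and the documentation prefix:

```python
import ipaddress

def slaac_address(prefix: str, mac: str) -> str:
    """Compute the SLAAC address a host forms from a prefix and its
    MAC via modified EUI-64 (ff:fe inserted, U/L bit flipped)."""
    b = bytes(int(octet, 16) for octet in mac.split(":"))
    eui64 = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:6]
    net = ipaddress.IPv6Network(prefix)
    return str(net[int.from_bytes(eui64, "big")])

# Hypothetical MAC address and documentation prefix:
print(slaac_address("2001:db8::/64", "00:11:22:33:44:55"))
# -> 2001:db8::211:22ff:fe33:4455
```

Since the output depends only on the prefix and the client's MAC, any
router advertising the same prefix produces the same client addresses.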

During a failover this would lead to a few seconds of interruption and
re-established sessions. Given that routers are usually rather stable
and restarting them is not a daily task, we initially accepted this.

## Keepalived/VRRP changes

One of the more tricky things is changes to keepalived. Because
keepalived uses the *number of addresses and routes* to verify
that a received VRRP packet matches its configuration, adding or
deleting IP addresses and routes causes a problem:

While one router is being updated, the number of IP addresses or
routes differs. This causes both routers to ignore the other's VRRP
messages and both routers think they should be the master process.
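This failure mode can be modelled in a few lines (a deliberately
simplified sketch of keepalived's consistency check, not its actual
code):

```python
def accepts_advert(own_addr_count, advert_addr_count):
    """Model: keepalived ignores VRRP adverts whose address count
    differs from its own configuration (simplified)."""
    return advert_addr_count == own_addr_count

def is_master(own, peer):
    # If the peer's advert is rejected, this router sees no other
    # master and promotes itself -- regardless of priority.
    if not accepts_advert(own["addresses"], peer["addresses"]):
        return True
    return own["priority"] > peer["priority"]

r1 = {"priority": 150, "addresses": 3}  # already updated: 3 VIPs
r2 = {"priority": 100, "addresses": 2}  # not yet updated: 2 VIPs

print(is_master(r1, r2), is_master(r2, r1))  # -> True True: double master
```

With equal address counts the priority comparison works as intended
and only one router claims mastership; during a rolling config update
the counts diverge and both do.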

This leads to the problem that both routers receive client and outside
traffic. The firewall (nftables) then fails to recognise returning
packets: if they were sent out by router1, but received back by
router2, then nftables, being configured *stateful*, will drop the
returning packet.

However, not only changes to the configuration can trigger this
problem, but also any communication problem between the two
routers. Since 2017 we have experienced multiple times that keepalived
was unable to receive or send messages to or from the other router and
thus both of them again became the master process.

## Take away

While in theory keepalived should improve reliability, in practice the
number of problems due to double-master situations made us question
whether the keepalived concept is the right fit for us.

You can read how we evolved from this setup in
[the next blog article](/u/blog/datacenterlight-ipv6-only-netboot/).
				
			||||||
							
								
								
									
										192
									
								
								content/u/blog/glamp-1-2021/contents.lr
									
										
									
									
									
										Normal file
									
								
							
							
						
						
									
										192
									
								
								content/u/blog/glamp-1-2021/contents.lr
									
										
									
									
									
										Normal file
									
								
							| 
						 | 
					@ -0,0 +1,192 @@
title: GLAMP #1 2021
---
pub_date: 2021-07-17
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
The first un-hack4glarus happens as a camp - Thursday 2021-08-19 to Sunday 2021-08-22.
---
body:

## TL;DR

Get your tent, connect it to power and 10Gbit/s Internet in the midst
of the Glarner mountains. Happening Thursday 2021-08-19 to Sunday 2021-08-22.
Apply for participation by mail (information at the bottom of the page).

## Introduction

It has been some time since our
[last Hack4Glarus](https://hack4glarus.ch) and we have been missing
all our friends, hackers and participants. At ungleich we have been
watching the development of the Coronavirus world wide and as you
might know, we have decided against a Hack4Glarus for this summer, as
the Hack4Glarus has been an indoor event so far.

## No Hack4Glarus = GLAMP

However, we want to try a different format that ensures proper
safety. Instead of an indoor Hack4Glarus in Linthal, we introduce
the Glarus Camp (or GLAMP in short) to you. An outdoor event with
sufficient space for distancing. As a camping site we can use the
surroundings of the Hacking Villa, supported by the Hacking Villa
facilities.

Compared to the Hack4Glarus, the GLAMP will focus more on
*relaxation* and *hanging out* than on being a hackathon. We think
times are hard enough to give everyone a break.

## The setting

Many of you know the [Hacking Villa](/u/projects/hacking-villa/) in
Diesbach already, located just next to the pretty waterfall and the
amazing Legler Areal. The villa is connected with 10 Gbit/s to the
[Data Center Light](/u/projects/data-center-light/) and offers a lot
of fun things to do.

## Coronavirus measures beforehand

To ensure safety for everyone, we ask everyone attending to provide
reasonable proof of not spreading the corona virus in one of the
following ways:

* You have been vaccinated
* You had the corona virus and you are symptom free for at least 14
  days
* You have been tested with a PCR test (7 days old at maximum) and the
  result was negative

All participants will be required to take a short antigen test on
site.

**Please do not attend if you feel sick, for the safety of everyone else.**

## Coronavirus measures on site

To keep the space safe on site as well, we ask you to follow these
rules:

* Sleep in your own tent
* Wear masks inside the Hacking Villa
  * Especially if you are preparing food shared with others
* Keep distance and respect others' safety wishes

## Hacking Villa Facilities

* Fast Internet (what more do you need?)
* A shared, open area outside for hacking
* Toilets and bathroom located inside

## What to bring

* A tent + sleeping equipment
* Fun stuff
  * Your computer
  * Wifi / IoT / Hacking things
* If you want wired Internet in your tent: a 15m+ Ethernet cable
  * WiFi will be provided everywhere

## What is provided

* Breakfast every morning
* A place for a tent
* Power to the tent (Swiss plug)
* WiFi to the tent
* Traditional closing event spaghetti

## What you can find nearby

* A nearby supermarket (2km) reachable by foot, scooter, bike
* A waterfall + barbecue place (~400m)
* Daily attractions such as hacking, hiking, biking, hanging out

## Registration

As the space is limited, we can accommodate about 10 tents (roughly 23
people). To register, send an email to support@ungleich.ch based on
the following template:

```
Subject: GLAMP#1 2021

For each person with you (including yourself):

    Non Coronavirus proof:
    (see requirements on the glamp page)

    Name(s):
    (how you want to be called)

    Interests:
    (will be shown to others at the glamp)

    Skills:
    (will be shown to others at the glamp)

    Food interests:
    (we use this for pooling food orders)

    What I would like to do:
    (will be shown to others at the glamp)

```

The participation fee is 70 CHF/person (to be paid on arrival).

## Time, Date and Location

* Arrival possible from Wednesday 2021-08-18 16:00
* GLAMP#1 starts officially on Thursday 2021-08-19, 1000
* GLAMP#1 closing lunch Sunday 2021-08-22, 1200
* GLAMP#1 ends officially on Sunday 2021-08-22, 1400

Location: [Hacking Villa](/u/projects/hacking-villa/)

## FAQ

### Where do I get Internet?

It is available everywhere at/around the Hacking Villa via WiFi. For
cable based Internet bring a 15m+ Ethernet cable.

### Where do I get Electricity?

You'll get electricity directly to the tent. Additionally the shared
area also has electricity. You can also bring solar panels, if you
like.

### Where do I get food?

Breakfast is provided by us. But what about the rest of the day?
There are a lot of delivery services available, ranging from Pizza,
Tibetan and Thai to Swiss (yes!).

Nearby are 2 Volg supermarkets, the next Coop is in Schwanden, a
bigger Migros in Glarus and a very big Coop can be found in
Netstal. The Volg is reachable by foot, all others are reachable by
train or bike.

There is also a kitchen inside the Hacking Villa for cooking, and a
great barbecue place just next to the waterfall.

### What can I do at the GLAMP?

There are
[alot](http://hyperboleandahalf.blogspot.com/2010/04/alot-is-better-than-you-at-everything.html)
of opportunities at the GLAMP:

You can ...

* just relax and hangout
* hack on a project that you have postponed for long
* hike up mountains (up to 3612m! Lower is also possible)
* meet other hackers
* explore the biggest water power plant in Europe (Linth Limmern)
* and much much more!

content/u/blog/glamp-1-2021/diesback-bg-small.jpg (new binary file, not shown; 380 KiB)
(second binary file not shown; 167 KiB)
@@ -0,0 +1,123 @@
			||||||
 | 
					title: Configuring bind to only forward DNS to a specific zone
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					pub_date: 2021-07-25
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					author: ungleich
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					twitter_handle: ungleich
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					_hidden: no
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					_discoverable: yes
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					abstract:
 | 
				
			||||||
 | 
					Want to use BIND for proxying to another server? This is how you do it.
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					body:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Introduction
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In this article we'll show you an easy solution to host DNS zones on
 | 
				
			||||||
 | 
					IPv6 only or private DNS servers. The method we use here is **DNS
 | 
				
			||||||
 | 
					forwarding** as offered in ISC BIND, but one could also see this as
 | 
				
			||||||
 | 
					**DNS proxying**.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Background
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Sometimes you might have a DNS server that is authoritative for DNS
 | 
				
			||||||
 | 
					data, but is not reachable for all clients. This might be the case for
 | 
				
			||||||
 | 
					instance, if
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					* your DNS server is IPv6 only: it won't be directly reachable from
 | 
				
			||||||
 | 
					  the IPv4 Internet
 | 
				
			||||||
 | 
					* your DNS server is running in a private network, either IPv4 or IPv6
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In both cases, you need something that is publicly reachable, to
 | 
				
			||||||
 | 
					enable clients to access the zone, like show in the following picture:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
## The problem: Forwarding requires recursive queries

ISC BIND allows forwarding queries to another name server. However, to
do so, it needs to be configured to handle recursive queries.
But if we allow recursive queries from any client, we basically
create an [open DNS resolver, which can be quite
dangerous](https://www.ncsc.gov.ie/emailsfrom/DDoS/DNS/).

## The solution

ISC BIND by default has a root hints file compiled in, which allows it
to function as a resolver without any additional configuration
files. That is great, but not if you want to prevent it from becoming
an open resolver as described above. Luckily, we can easily fix that
problem. Now, let's have a look at a real world use case,
step-by-step:

### Step 1: Global options

In the first step, we need to set the global options to allow
recursion from anyone, as follows:

```
options {
    directory "/var/cache/bind";

    listen-on-v6 { any; };

    allow-recursion { ::/0; 0.0.0.0/0; };
};
```

However, as mentioned above, this would create an open resolver. To
prevent this, let's disable the root hints:

### Step 2: Disable root hints

The root hints are served in the root zone, also known as ".". To
disable it, we give BIND an empty file to use:

```
zone "." {
        type hint;
        file "/dev/null";
};
```

Note: in case you do want to allow recursion for some
clients, **you can create multiple DNS views**.
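
As a rough sketch (the internal network 2001:db8::/32 is a
placeholder, and keep in mind that once views are in use, every zone
statement has to live inside a view), such a split could look like:

```
// Sketch only: allow recursion for internal clients, deny it for the world.
acl internal { ::1; 2001:db8::/32; };

view "internal" {
        match-clients { internal; };
        recursion yes;
};

view "external" {
        match-clients { any; };
        recursion no;
};
```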
					

### Step 3: The actual DNS file

In our case, we have a lot of IPv6 only kubernetes clusters, which are
named `xx.k8s.ooo` and have a world-wide reachable CoreDNS server built
in. In this case, we want the domain c1.k8s.ooo to be reachable
world-wide, so we configure the dual-stack server as follows:

```
zone "c1.k8s.ooo"  {
   type forward;
   forward only;
   forwarders { 2a0a:e5c0:2:f::a; };
};
```

### Step 4: Adjusting the zone file

In case you are running an IPv6 only server, you also need to
configure the delegation in the parent zone. In our case this looks as
follows:

```
; The domain: c1.k8s.ooo
c1                          NS      kube-dns.kube-system.svc.c1

; The IPv6 only DNS server
kube-dns.kube-system.svc.c1 AAAA    2a0a:e5c0:2:f::a

; The forwarding IPv4 server
kube-dns.kube-system.svc.c1 A       194.5.220.43
```

## DNS, IPv6, Kubernetes?

If you are curious to learn more about any of these topics, feel
[free to join us on our chat](/u/projects/open-chat/).
										
content/u/blog/ipv6-link-local-support-in-browsers/contents.lr
title: Support for IPv6 link local addresses in browsers
---
pub_date: 2021-06-14
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Tracking the progress of browser support for link local addresses
---
body:
					

## Introduction

Link local addresses
([fe80::/10](https://en.wikipedia.org/wiki/Link-local_address)) are
used for addressing devices in your local subnet. They can be
generated automatically, and using the IPv6 multicast address
**ff02::1**, all hosts on the local subnet can easily be located.

However, browsers like Chrome or Firefox do not support **entering link
local addresses inside a URL**, which prevents accessing devices
locally with a browser, for instance for configuring them.

Link local addresses need **zone identifiers** to specify which
network device to use as the outgoing interface. This is because
**you have link local addresses on every interface** and your network
stack does not know on its own which interface to use. So typically a
link local address is something along the lines of
**fe80::fae4:e3ff:fee2:37a4%eth0**, where **eth0** is the zone
identifier.
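
As a side note, zone identifiers are already well supported outside of
browsers; Python's standard `ipaddress` module (since Python 3.9), for
instance, parses them directly. A small illustration (the address and
interface name are just examples):

```python
import ipaddress

# Parse a link local address including its zone identifier (example values).
addr = ipaddress.IPv6Address("fe80::fae4:e3ff:fee2:37a4%eth0")

print(addr.scope_id)       # prints "eth0", the zone identifier
print(addr.is_link_local)  # prints True: the address is inside fe80::/10
```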
					

The problem is becoming more pronounced as the world is moving more
and more towards **IPv6 only networks**.

You might not even know the address of your network equipment anymore,
but you can easily locate it using the **ff02::1 multicast
address**. So we need support in browsers to allow network
configuration.

## Status of implementation

The main purpose of this document is to track the status of
link-local address support in the different browsers and related
standards. The current status is:

* Firefox says whatwg did not define it
* Whatwg says the zone id is intentionally omitted and references w3.org
* w3.org has a longer reasoning, but it basically boils down to
  "Firefox and chrome don't do it and it's complicated and nobody needs it"
* Chromium says it seems not to be worth the effort

Given that chain of events, if either Firefox, Chrome, w3.org or
Whatwg were to add support for it, it seems likely that the others
would follow.
					

## IPv6 link local address support in Firefox

The progress of IPv6 link local addresses for Firefox is tracked
on [the mozilla
bugzilla](https://bugzilla.mozilla.org/show_bug.cgi?id=700999). The
current situation is that Firefox references the lack of
standardisation by whatwg as a reason for not implementing it. Quoting
Valentin Gosu from the Mozilla team:

```
The main reason the zone identifier is not supported in Firefox is
that parsing URLs is hard.  You'd think we can just pass whatever
string to the system API and it will work or fail depending on whether
it's valid or not, but that's not the case. In bug 1199430 for example
it was apparent that we need to make sure that the hostname string is
really valid before passing it to the OS.

I have no reason to oppose zone identifiers in URLs as long as the URL
spec defines how to parse them.  As such, I encourage you to engage
with the standard at https://github.com/whatwg/url/issues/392 instead
of here.

Thank you!
```

## IPv6 link local address support in whatwg

The situation at [whatwg](https://whatwg.org/) is that there is a
[closed bug report on github](https://github.com/whatwg/url/issues/392)
and [in the spec it says](https://url.spec.whatwg.org/#concept-ipv6)
that

    Support for <zone_id> is intentionally omitted.

That paragraph links to a bug registered at w3.org (see next chapter).

## IPv6 link local address support at w3.org

At [w3.org](https://www.w3.org/) there is a
bug titled
[Support IPv6 link-local
addresses?](https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2)
that is set to status **RESOLVED WONTFIX**. It was closed basically
based on the following statement from Ryan Sleevi:

```
Yes, we're especially not keen to support these in Chrome and have
repeatedly decided not to. The platform-specific nature of <zone_id>
makes it difficult to impossible to validate the well-formedness of
the URL (see https://tools.ietf.org/html/rfc4007#section-11.2 , as
referenced in 6874, to fully appreciate this special hell). Even if we
could reliably parse these (from a URL spec standpoint), it then has
to be handed 'somewhere', and that opens a new can of worms.

Even 6874 notes how unlikely it is to encounter these in practice -
  "Thus, URIs including a
   ZoneID are unlikely to be encountered in HTML documents.  However, if
   they do (for example, in a diagnostic script coded in HTML), it would
   be appropriate to treat them exactly as above."

Note that a 'dumb' parser may not be sufficient, as the Security Considerations of 6874 note:
  "To limit this risk, implementations MUST NOT allow use of this format
   except for well-defined usages, such as sending to link-local
   addresses under prefix fe80::/10.  At the time of writing, this is
   the only well-defined usage known."

And also
  "An HTTP client, proxy, or other intermediary MUST remove any ZoneID
   attached to an outgoing URI, as it has only local significance at the
   sending host."

This requires a transformative rewrite of any URLs going out the
wire. That's pretty substantial. Anne, do you recall the bug talking
about IP canonicalization (e.g. http://127.0.0.1 vs
http://[::127.0.0.1] vs http://012345 and friends?) This is
conceptually a similar issue - except it's explicitly required in the
context of <zone_id> that the <zone_id> not be emitted.

There's also the issue that zone_id precludes/requires the use of APIs
that user agents would otherwise prefer to avoid, in order to
'properly' handle the zone_id interpretation. For example, Chromium on
some platforms uses a built in DNS resolver, and so our address lookup
functions would need to define and support <zone_id>'s and map them to
system concepts. In doing so, you could end up with weird situations
where a URL works in Firefox but not Chrome, even though both
'hypothetically' supported <zone_id>'s, because FF may use an OS
routine and Chrome may use a built-in routine and they diverge.

Overall, our internal consensus is that <zone_id>'s are bonkers on
many grounds - the technical ambiguity (and RFC 6874 doesn't really
resolve the ambiguity as much as it fully owns it and just says
#YOLOSWAG) - and supporting them would add a lot of complexity for
what is explicitly and admittedly a limited value use case.
```

This bug references the Mozilla Firefox bug above and
[RFC 3986 (updated by RFC
6874)](https://datatracker.ietf.org/doc/html/rfc6874#section-2).

## IPv6 link local address support in Chrome / Chromium

On the Chrome side there is a
[huge bug
report](https://bugs.chromium.org/p/chromium/issues/detail?id=70762)
which in turn references a large number of other bugs that request
IPv6 link local support, too.

The bug was closed by cbentzel@chromium.org stating:

```
There are a large number of special cases which are required on core
networking/navigation/etc. and it does not seem like it is worth the
up-front and ongoing maintenance costs given that this is a very
niche - albeit legitimate - need.
```

The bug at chromium has been made un-editable, so it is basically
frozen, even though people had added suggestions to the ticket on how
to solve it.

## Work Arounds

### IPv6 link local connect hack
					

Peter has [documented the IPv6 link local connect
hack](https://website.peterjin.org/wiki/Snippets:IPv6_link_local_connect_hack)
to make Firefox use **fe90:0:[scope id]:[IP address]** to reach
**fe80::[IP address]%[scope id]**. Check out his website for details!

### IPv6 hack using ip6tables

Also from Peter comes the hint that you can use newer ip6tables
versions to achieve a similar mapping:

"On modern Linux kernels you can also run

```ip6tables -t nat -A OUTPUT -d fef0::/64 -j NETMAP --to fe80::/64```

if you have exactly one outbound interface, so that fef0::1 translates
to fe80::1"
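
The NETMAP rule above is a pure prefix swap: only the first 64 bits
change from fef0::/64 to fe80::/64, and the host bits are preserved. A
small sketch of that mapping with Python's `ipaddress` module (the
function name and addresses are ours, for illustration only):

```python
import ipaddress

def netmap(addr: str, src: str = "fef0::/64", dst: str = "fe80::/64") -> str:
    """Swap the /64 prefix of addr from src to dst, keeping the host bits."""
    a = int(ipaddress.IPv6Address(addr))
    src_net = ipaddress.IPv6Network(src)
    dst_net = ipaddress.IPv6Network(dst)
    host_bits = a & int(src_net.hostmask)
    return str(ipaddress.IPv6Address(int(dst_net.network_address) | host_bits))

print(netmap("fef0::1"))  # prints fe80::1
```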

Thanks again for the pointer!

## Other resources

If you are aware of other resources regarding IPv6 link local support
in browsers, please join the [IPv6.chat](https://IPv6.chat) and let us
know about it.
							
								
								
									
content/u/blog/kubernetes-dns-entries-nat64/contents.lr
title: Automatic A and AAAA DNS entries with NAT64 for kubernetes?
---
pub_date: 2021-06-24
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Given a kubernetes cluster and NAT64 - how do you create DNS entries?
---
body:
					

## The DNS kubernetes quiz

Today our blog entry does not (yet) show a solution, but rather poses
a tricky quiz on creating DNS entries. The problem to solve is the
following:

* How to make every IPv6 only service in kubernetes also IPv4
  reachable?

Let's see who can solve it first, or most elegantly. Below are some
thoughts on how to approach this problem.

## The situation

Assume your kubernetes cluster is IPv6 only and all services
have proper AAAA DNS entries. This allows you
[to directly receive traffic from the
Internet](/u/blog/kubernetes-without-ingress/) to
your kubernetes services.

Now to make such a service also IPv4 reachable, we can deploy a NAT64
service that maps an IPv4 address outside the cluster to an IPv6 service
address inside the cluster:

```
A.B.C.D --> 2001:db8::1
```
					

So all traffic to that IPv4 address is converted to IPv6 by the
external NAT64 translator.
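
As background: instead of a static per-service mapping, many NAT64
setups embed the IPv4 address into the low 32 bits of a dedicated IPv6
prefix (RFC 6052's well-known prefix is 64:ff9b::/96). A small sketch
of that embedding (illustrative only; the setup described above uses a
static mapping instead):

```python
import ipaddress

def nat64_embed(v4: str, prefix: str = "64:ff9b::/96") -> str:
    """Embed an IPv4 address into the last 32 bits of a NAT64 /96 prefix."""
    net = ipaddress.IPv6Network(prefix)
    v6 = int(net.network_address) | int(ipaddress.IPv4Address(v4))
    return str(ipaddress.IPv6Address(v6))

print(nat64_embed("192.0.2.1"))  # prints 64:ff9b::c000:201
```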
					

## The proxy service

Let's say the service running on 2001:db8::1 is named "ipv4-proxy" and
thus reachable at ipv4-proxy.default.svc.example.com.

What we want to achieve is to expose every possible service
inside the cluster **also via IPv4**. For this purpose we have created
an haproxy container that accepts requests for *.svc.example.com and
forwards them via IPv6.

So the actual flow would look like:

```
IPv4 client --[ipv4]--> NAT64 -[ipv6]-> proxy service
                                         |
                                         |
                                         v
IPv6 client ---------------------> kubernetes service
```
					

## The DNS dilemma

It would be very tempting to create a wildcard DNS entry, or to
configure/patch CoreDNS to also include an A entry for every service,
that is:

```
*.svc IN A A.B.C.D
```

So essentially all services resolve to the IPv4 address A.B.C.D. That
however would also influence the kubernetes cluster, as pods
potentially resolve A entries (not only AAAA) as well.

As the containers / pods do not have any IPv4 address (nor IPv4
routing), access to IPv4 is not possible. There are various outcomes
of this situation:

1. The software in the container does happy eyeballs and tries both
   A/AAAA and uses the working IPv6 connection.

2. The software in the container misbehaves, takes the first record
   and uses IPv4 (nodejs is known to have or have had a broken
   resolver that did exactly that).

So adding that wildcard might not be the smartest option. And
additionally it is unclear whether CoreDNS would support that.
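
The difference between outcome 1 and outcome 2 is essentially address
ordering: a happy-eyeballs-style client prefers the AAAA answers and
keeps the A answers only as fallback. A minimal illustration of that
ordering (our own sketch, not the resolver logic of any particular
runtime):

```python
import socket

def order_addresses(records):
    """Happy-eyeballs-style ordering: prefer IPv6, keep IPv4 as fallback."""
    v6 = [addr for family, addr in records if family == socket.AF_INET6]
    v4 = [addr for family, addr in records if family == socket.AF_INET]
    return v6 + v4

# A broken client would just take the first record; an ordered list
# makes the working IPv6 address come first.
records = [(socket.AF_INET, "192.0.2.1"), (socket.AF_INET6, "2001:db8::1")]
print(order_addresses(records))  # prints ['2001:db8::1', '192.0.2.1']
```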
					

## Alternative automatic DNS entries

The *.svc names in a kubernetes cluster are special in the sense that
they are used for connecting internally. What if CoreDNS (or any other
DNS server) would, instead of using *.svc, use a second subdomain like
*abc*.*namespace*.v4andv6.example.com and generate the same AAAA
record as for the service, plus a static A record like the one
described above?

That could solve the problem. But again, does CoreDNS support that?
					

## Automated DNS entries in other zones

Instead of fully automatically creating the entries as above, another
option would be to specify DNS entries via annotations in a totally
different zone, if CoreDNS supported this. So let's say we also
have control over example.org and could instruct CoreDNS to create
the following entries automatically from an annotation:

```
abc.something.example.org AAAA <same as the service IP>
abc.something.example.org A    <a static IPv4 address A.B.C.D>
```

In theory this might be solved via some scripting, maybe with a DNS
server like PowerDNS?
					

## Alternative solution with BIND

The BIND DNS server, which is not usually deployed in a kubernetes
cluster, supports **views**. Views enable different replies to the
same query depending on the source IP address. Thus in theory
something like that could be done, assuming a secondary zone
*example.org*:

* If the request comes from the kubernetes cluster, return a CNAME
  back to example.com.
* If the request comes from outside the kubernetes cluster, return an
  A entry with the static IP
* Unsolved: how to match on the AAAA entries (because we don't CNAME
  with the added A entry)
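
A hedged sketch of what the first two bullet points could look like in
named.conf (the view names, the cluster network 2001:db8:1::/48 and the
zone file names are placeholders; note that with views in use, all
zones must be declared inside views):

```
view "cluster" {
        // placeholder for the kubernetes cluster's network
        match-clients { 2001:db8:1::/48; };
        // internal zone file: CNAMEs back to example.com
        zone "example.org" { type master; file "db.example.org.internal"; };
};

view "world" {
        match-clients { any; };
        // external zone file: static A entries pointing at A.B.C.D
        zone "example.org" { type master; file "db.example.org.external"; };
};
```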

## Other solution?

As you can see, mixing dynamic IP generation and coupling it with
static DNS entries for IPv4 resolution is not the easiest of tasks. If
you have a smart idea on how to solve this without manually creating
entries for each and every service,
[give us a shout!](/u/contact)
title: Making kubernetes kube-dns publicly reachable
---
pub_date: 2021-06-13
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Looking into IPv6 only DNS provided by kubernetes
---
body:
					

## Introduction

If you have seen our
[article about running kubernetes
Ingress-less](/u/blog/kubernetes-without-ingress/), you are aware that
we are pushing IPv6 only kubernetes clusters at ungleich.

Today, we are looking at making the "internal" kube-dns service world
reachable using IPv6 and global DNS servers.

## The kubernetes DNS service

If you have a look at your typical k8s cluster, you will notice that
you usually have two coredns pods running:

```
% kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-558bd4d5db-gz5c7   1/1     Running   0          6d
coredns-558bd4d5db-hrzhz   1/1     Running   0          6d
```

These pods are usually served by the **kube-dns** service:

```
% kubectl -n kube-system get svc -l k8s-app=kube-dns
NAME       TYPE        CLUSTER-IP           EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   2a0a:e5c0:13:e2::a   <none>        53/UDP,53/TCP,9153/TCP   6d1h
```
					

As you can see, the kube-dns service is running on a publicly
reachable IPv6 address.

## IPv6 only DNS

IPv6 only DNS servers have one drawback: they cannot be reached
during DNS recursion if the resolver is IPv4 only.

At [ungleich we run a variety of
services](https://redmine.ungleich.ch/projects/open-infrastructure/wiki)
to make IPv6 only services usable in the real world. In the case of
DNS, we are using **DNS forwarders**. They act similarly to HTTP
proxies, but for DNS.

So on our main DNS servers, dns1.ungleich.ch, dns2.ungleich.ch
and dns3.ungleich.ch, we have added the following configuration:

					```
 | 
				
			||||||
 | 
					zone "k8s.place7.ungleich.ch"  {
 | 
				
			||||||
 | 
					   type forward;
 | 
				
			||||||
 | 
					   forward only;
 | 
				
			||||||
 | 
					   forwarders { 2a0a:e5c0:13:e2::a; };
 | 
				
			||||||
 | 
					};
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					This tells the DNS servers to forward DNS queries that come in for
 | 
				
			||||||
 | 
					k8s.place7.ungleich.ch to **2a0a:e5c0:13:e2::a**.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Additionally we have added **DNS delegation** in the
 | 
				
			||||||
 | 
					place7.ungleich.ch zone:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					k8s NS dns1.ungleich.ch.
 | 
				
			||||||
 | 
					k8s NS dns2.ungleich.ch.
 | 
				
			||||||
 | 
					k8s NS dns3.ungleich.ch.
 | 
				
			||||||
 | 
					```

## Using the kubernetes DNS service in the wild

With this configuration, we can now access IPv6 only
kubernetes services directly from the Internet. Let's first discover
the kube-dns service itself:

```
% dig kube-dns.kube-system.svc.k8s.place7.ungleich.ch. aaaa

; <<>> DiG 9.16.16 <<>> kube-dns.kube-system.svc.k8s.place7.ungleich.ch. aaaa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23274
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: f61925944f5218c9ac21e43960c64f254792e60f2b10f3f5 (good)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.k8s.place7.ungleich.ch. IN AAAA

;; ANSWER SECTION:
kube-dns.kube-system.svc.k8s.place7.ungleich.ch. 27 IN AAAA 2a0a:e5c0:13:e2::a

;; AUTHORITY SECTION:
k8s.place7.ungleich.ch.	13	IN	NS	kube-dns.kube-system.svc.k8s.place7.ungleich.ch.
```

As you can see, the **kube-dns** service in the **kube-system**
namespace resolves to 2a0a:e5c0:13:e2::a, which is exactly what we
have configured.
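The name being queried follows the usual kubernetes service DNS pattern
`<service>.<namespace>.svc.<cluster domain>`, where the cluster domain here is
the zone we delegated. A tiny sketch of that naming scheme:

```python
def service_fqdn(service, namespace, cluster_domain="k8s.place7.ungleich.ch"):
    """Build the DNS name of a kubernetes service.

    The cluster domain default is the delegated zone from this article."""
    return f"{service}.{namespace}.svc.{cluster_domain}"

print(service_fqdn("kube-dns", "kube-system"))
# kube-dns.kube-system.svc.k8s.place7.ungleich.ch
print(service_fqdn("ungleich-etherpad", "default"))
# ungleich-etherpad.default.svc.k8s.place7.ungleich.ch
```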

At the moment, there is also an etherpad test service
named "ungleich-etherpad" running:

```
% kubectl get svc -l app=ungleichetherpad
NAME                TYPE        CLUSTER-IP              EXTERNAL-IP   PORT(S)    AGE
ungleich-etherpad   ClusterIP   2a0a:e5c0:13:e2::b7db   <none>        9001/TCP   3d19h
```

Let's first verify that it resolves:

```
% dig +short ungleich-etherpad.default.svc.k8s.place7.ungleich.ch aaaa
2a0a:e5c0:13:e2::b7db
```

And if that works, well, then we should also be able to access the
service itself!

```
% curl -I http://ungleich-etherpad.default.svc.k8s.place7.ungleich.ch:9001/
HTTP/1.1 200 OK
X-Powered-By: Express
X-UA-Compatible: IE=Edge,chrome=1
Referrer-Policy: same-origin
Content-Type: text/html; charset=utf-8
Content-Length: 6039
ETag: W/"1797-Dq3+mr7XP0PQshikMNRpm5RSkGA"
Set-Cookie: express_sid=s%3AZGKdDe3FN1v5UPcS-7rsZW7CeloPrQ7p.VaL1V0M4780TBm8bT9hPVQMWPX5Lcte%2BzotO9Lsejlk; Path=/; HttpOnly; SameSite=Lax
Date: Sun, 13 Jun 2021 18:36:23 GMT
Connection: keep-alive
Keep-Alive: timeout=5
```

(Attention: this is a test service and might not be running anymore when
you read this article at a later time.)

## IPv6 vs. IPv4

Could we have achieved the same with IPv4? The answer here is "maybe":
if the kubernetes service is reachable from globally reachable
nameservers via IPv4, then the answer is yes. This could be done via
public IPv4 addresses in the kubernetes cluster, via tunnels, VPNs,
etc.

However, generally speaking, the DNS service of a
kubernetes cluster running on RFC1918 IP addresses is probably not
reachable from globally reachable DNS servers by default.
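This distinction can be checked programmatically. A small sketch using
Python's `ipaddress` module, with the kube-dns address from this article
and a typical RFC1918 cluster IP (10.96.0.10 is an assumption for
illustration, not taken from this setup):

```python
import ipaddress

# A typical RFC1918 service IP as used in many IPv4 k8s clusters
# (hypothetical example, not from this article).
v4_service = ipaddress.ip_address("10.96.0.10")

# The globally routed IPv6 kube-dns address from this article.
v6_service = ipaddress.ip_address("2a0a:e5c0:13:e2::a")

# RFC1918 addresses are private: globally reachable DNS servers
# cannot reach them without tunnels or NAT.
print(v4_service.is_private)  # True
print(v4_service.is_global)   # False

# The IPv6 address is global unicast: reachable from the Internet,
# provided routing and firewalling permit it.
print(v6_service.is_global)   # True
```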

For IPv6 the case is a bit different: we are using globally reachable
IPv6 addresses in our k8s clusters, so they can potentially be
reached without the need for any tunnel whatsoever. Firewalling
and network policies can obviously prevent access, but if the IP
addresses are properly routed, they will be accessible from the public
Internet.

And this makes things much easier for DNS servers, which also
have IPv6 connectivity.

The following picture shows the practical difference between the two
approaches:

## Does this make sense?

That clearly depends on your use-case. If you want your service DNS
records to be publicly accessible, then the clear answer is yes.

If your cluster services are intended to be internal only
(see the [previous blog post](/u/blog/kubernetes-without-ingress/)), then
exposing the DNS service to the world might not be the best option.

## Note on security

CoreDNS inside kubernetes is by default configured to allow resolving
for *any* client that can reach it. Thus if you make your kube-dns
service world reachable, you also turn it into an open resolver.

At the time of writing this blog article, the following coredns
configuration **does NOT** correctly block requests:

```
  Corefile: |
    .:53 {
        acl k8s.place7.ungleich.ch {
             allow net ::/0
        }
        acl . {
              allow net 2a0a:e5c0:13::/48
              block
        }
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
...
```

Until this is solved, we recommend placing a firewall in front of your
public kube-dns service to allow requests only from the forwarding DNS
servers.

## More of this

We are discussing
kubernetes and IPv6 related topics in
**the #hacking:ungleich.ch Matrix channel**
([you can sign up here if you don't have an
account](https://chat.with.ungleich.ch)) and will post more about our
k8s journey in this blog. Stay tuned!
122 content/u/blog/kubernetes-network-planning-with-ipv6/contents.lr Normal file
@ -0,0 +1,122 @@
title: Kubernetes Network planning with IPv6
---
pub_date: 2021-06-26
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
Learn which networks are good to use with kubernetes
---
body:

## Introduction

While IPv6 has a huge address space, you will need to specify a
**podCidr** (the network for the pods) and a **serviceCidr** (the
network for the services) for kubernetes. In this blog article we show
our findings and give a recommendation on the "most sensible"
networks to use for kubernetes.

## TL;DR


## Kubernetes limitations

In a typical IPv6 network, you would "just assign a /64" to anything
that needs to be a network. It is a bit of an IPv6 no-brainer way of
handling networking.

However, kubernetes has a limitation:
[the serviceCidr cannot be bigger than a /108 at the
moment](https://github.com/kubernetes/kubernetes/pull/90115).
This is something very atypical for the IPv6 world, but nothing we
cannot handle. There are various pull requests and issues to fix this
behaviour on github, some of them listed below:

* https://github.com/kubernetes/enhancements/pull/1534
* https://github.com/kubernetes/kubernetes/pull/79993
* https://github.com/kubernetes/kubernetes/pull/90115 (this one is
  quite interesting to read)

That said, it is possible to use a /64 for the **podCidr**.

## The "correct way" without the /108 limitation

If kubernetes did not have this limitation, our recommendation would
be to use one /64 for the podCidr and one /64 for the serviceCidr. If
the limitations of kubernetes are lifted in the future, skip
reading this article and just use two /64's.

Do not be tempted to suggest making /108's the default, even if they
"have enough space", because using /64's allows you to keep much
simpler network plans.

## Sanity checking the /108

To be able to plan kubernetes clusters, it is important to know where
they should live, especially if you plan to have a lot of kubernetes
clusters. Let's have a short look at the /108 network limitation:

A /108 allows 20 bits to be used for generating addresses, or a total
of 1048576 hosts. This is probably enough for the number of services
in a cluster. Now, can we be consistent and also use a /108 for the
podCidr? Let's assume for the moment that we do exactly that, so we
run a maximum of 1048576 pods at the same time. Assuming each service
consumes on average 4 pods, this would allow one to run 262144
services.

Assuming each pod uses around 0.1 CPUs and 100Mi RAM, if all pods were
to run at the same time, you would need ca. 100'000 CPUs and 100 TB
RAM. Assuming further that each node contains at maximum 128 CPUs and
at maximum 1 TB RAM (quite powerful servers), we would need more than
750 servers just for the CPUs.
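The back-of-the-envelope numbers above can be reproduced in a few lines of
Python; the per-pod figures of 0.1 CPU and 100Mi RAM are the same
assumptions as in the text:

```python
import math

# Address bits available in a /108: 128 - 108 = 20
hosts = 2 ** (128 - 108)
print(hosts)                      # 1048576

# With ~4 pods per service on average
services = hosts // 4
print(services)                   # 262144

# Resource estimate, assuming 0.1 CPU and 100Mi RAM per pod
cpus = hosts * 0.1                # ~104858 CPUs
ram_gib = hosts * 100 / 1024      # ~102400 GiB, i.e. ~100 TiB

# Nodes with at most 128 CPUs each: CPU is the limiting factor
servers = math.ceil(cpus / 128)
print(servers)                    # 820, i.e. "more than 750 servers"
```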

So we can reason that **we can** run kubernetes clusters of quite some
size even with a **podCidr of /108**.

## Organising /108's

Let's assume that we organise all our kubernetes clusters in a single
/64, like 2001:db8:1:2::/64, which looks like this:

```
% sipcalc 2001:db8:1:2::/64
-[ipv6 : 2001:db8:1:2::/64] - 0

[IPV6 INFO]
Expanded Address	- 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address	- 2001:db8:1:2::
Subnet prefix (masked)	- 2001:db8:1:2:0:0:0:0/64
Address ID (masked)	- 0:0:0:0:0:0:0:0/64
Prefix address		- ffff:ffff:ffff:ffff:0:0:0:0
Prefix length		- 64
Address type		- Aggregatable Global Unicast Addresses
Network range		- 2001:0db8:0001:0002:0000:0000:0000:0000 -
			  2001:0db8:0001:0002:ffff:ffff:ffff:ffff
```

A /108 network on the other hand looks like this:

```
% sipcalc 2001:db8:1:2::/108
-[ipv6 : 2001:db8:1:2::/108] - 0

[IPV6 INFO]
Expanded Address	- 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address	- 2001:db8:1:2::
Subnet prefix (masked)	- 2001:db8:1:2:0:0:0:0/108
Address ID (masked)	- 0:0:0:0:0:0:0:0/108
Prefix address		- ffff:ffff:ffff:ffff:ffff:ffff:fff0:0
Prefix length		- 108
Address type		- Aggregatable Global Unicast Addresses
Network range		- 2001:0db8:0001:0002:0000:0000:0000:0000 -
			  2001:0db8:0001:0002:0000:0000:000f:ffff
```
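How many /108s fit into one /64 can be enumerated with Python's
`ipaddress` module; a small sketch using the same documentation prefix:

```python
import ipaddress
from itertools import islice

cluster_super = ipaddress.ip_network("2001:db8:1:2::/64")

# Each /108 cut out of the /64 can hold one kubernetes cluster network;
# 2^(108-64) = 2^44 such subnets fit into a single /64.
print(2 ** (108 - 64))            # 17592186044416

# Enumerate the first few /108s
for net in islice(cluster_super.subnets(new_prefix=108), 3):
    print(net)
# 2001:db8:1:2::/108
# 2001:db8:1:2::10:0/108
# 2001:db8:1:2::20:0/108
```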

Assuming for a moment that we assign a /108, this looks as follows:
70 content/u/blog/kubernetes-production-cluster-1/contents.lr Normal file
@ -0,0 +1,70 @@
title: ungleich production cluster #1
---
pub_date: 2021-07-05
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
In this blog article we describe our way to our first production
kubernetes cluster.
---
body:

## Introduction

This article is a work in progress describing all steps required for our
first production kubernetes cluster and the services that we run in it.

## Setup

### Bootstrapping

* All nodes are running [Alpine Linux](https://alpinelinux.org)
* All nodes are configured using [cdist](https://cdi.st)
  * Mainly installing kubeadm, kubectl, crio *and* docker
  * At the moment we try to use crio
* The cluster is initialised using **kubeadm init --config
  k8s/c2/kubeadm.yaml** from the [ungleich-k8s repo](https://code.ungleich.ch/ungleich-public/ungleich-k8s)

### CNI/Networking

* Calico is installed using **kubectl apply -f
  cni-calico/calico.yaml** from the [ungleich-k8s
  repo](https://code.ungleich.ch/ungleich-public/ungleich-k8s)
* Installing calicoctl using **kubectl apply -f
  https://docs.projectcalico.org/manifests/calicoctl.yaml**
* Aliasing calicoctl: **alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"**
* All nodes BGP peer with our infrastructure using **calicoctl create -f - < cni-calico/bgp-c2.yaml**

### Persistent Volume Claim support

* Provided by rook
* Using customized manifests to support IPv6 from ungleich-k8s

```
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
    kubectl apply -f ${yaml}.yaml
done
```

### Flux

Starting with the 2nd cluster?


## Follow up

If you are interested in continuing the discussion,
we are there for you in
**the #kubernetes:ungleich.ch Matrix channel**
([you can sign up here if you don't have an
account](https://chat.with.ungleich.ch)).

Or if you are interested in an IPv6 only kubernetes cluster,
drop a mail to **support**-at-**ungleich.ch**.
201 content/u/blog/kubernetes-without-ingress/contents.lr Normal file
@ -0,0 +1,201 @@
title: Building Ingress-less Kubernetes Clusters
---
pub_date: 2021-06-09
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:

---
body:

## Introduction

On [our journey to build and define IPv6 only kubernetes
clusters](https://www.nico.schottelius.org/blog/k8s-ipv6-only-cluster/)
we came across some principles that seem awkward in the IPv6 only
world. Let us today have a look at the *LoadBalancer* and *Ingress*
concepts.

## Ingress

Let's have a look at the [Ingress
definition](https://kubernetes.io/docs/concepts/services-networking/ingress/)
from the kubernetes website:

```
Ingress exposes HTTP and HTTPS routes from outside the cluster to
services within the cluster. Traffic routing is controlled by rules
defined on the Ingress resource.
```

So the Ingress basically routes from outside to inside. But in the
IPv6 world, services are already publicly reachable. It just
depends on your network policy.

### Update 2021-06-13: Ingress vs. Service

As some people pointed out (thanks a lot!), a public service is
**not the same** as an Ingress. An Ingress also has the ability to
route based on layer 7 information like the path, domain name, etc.

However, if all of the traffic from an Ingress points to a single
IPv6 HTTP/HTTPS Service, effectively the IPv6 service will do the
same, with one hop less.

## Services

Let's have a look at what services in IPv6 only clusters look like:

```
% kubectl get svc
NAME                                      TYPE        CLUSTER-IP              EXTERNAL-IP   PORT(S)                      AGE
etherpad                                  ClusterIP   2a0a:e5c0:13:e2::a94b   <none>        9001/TCP                     19h
nginx-service                             ClusterIP   2a0a:e5c0:13:e2::3607   <none>        80/TCP                       43h
postgres                                  ClusterIP   2a0a:e5c0:13:e2::c9e0   <none>        5432/TCP                     19h
...
```

All these services are world reachable, depending on your network
policy.

## ServiceTypes

While we are looking at the k8s primitives, let's have a closer
look at the **Service**, specifically at 3 of the **ServiceTypes**
supported by k8s, including their definitions:

### ClusterIP

The k8s website says

```
Exposes the Service on a cluster-internal IP. Choosing this value
makes the Service only reachable from within the cluster. This is the
default ServiceType.
```

So in the context of IPv6, this sounds wrong. There is nothing that
makes a global IPv6 address "internal", besides possible network
policies. The concept probably stems from the strict separation between
the RFC1918 space usually used in k8s clusters and public IPv4 space.

This difference does not make a lot of sense in the IPv6 world though.
Seeing **services as public by default** makes much more sense.
And simplifies your clusters a lot.

### NodePort

Let's first have a look at the definition again:

```
Exposes the Service on each Node's IP at a static port (the
NodePort). A ClusterIP Service, to which the NodePort Service routes,
is automatically created. You'll be able to contact the NodePort
Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
```

Conceptually this can be utilised similarly in the IPv6 only world
as in the IPv4 world. However, given that there are enough
addresses available with IPv6, this might not be such an interesting
ServiceType anymore.


### LoadBalancer

Before we have a look at this type, let's take some steps back
first to ...


## ... Load Balancing

There are a variety of possibilities to do load balancing. From simple
round robin, to ECMP based load balancing, to application aware,
potentially weighted load balancing.

So for load balancing, there is usually more than one solution, and
one size likely does not fit all.

With this said, let's have a look at the
**ServiceType LoadBalancer** definition:

```
Exposes the Service externally using a cloud provider's load
balancer. NodePort and ClusterIP Services, to which the external load
balancer routes, are automatically created.
```

So whatever the cloud provider offers can be used, and that is a good
thing. However, let's have a look at how you get load balancing for
free in IPv6 only clusters:

## Load Balancing in IPv6 only clusters

So what is the easiest way of reliable load balancing in a network?
[ECMP (equal cost multi path)](https://en.wikipedia.org/wiki/Equal-cost_multi-path_routing)
comes to mind right away. Given that
kubernetes nodes can BGP peer with the network (upstream or the
switches), this basically gives load balancing to the world for free:

```
                            [ The Internet ]
                                   |
     [ k8s-node-1 ]-----------[ network ]-----------[ k8s-node-n]
                              [  ECMP   ]
                                   |
                             [ k8s-node-2]

```

In the real world, on a bird based BGP upstream router,
this looks as follows:

```
[18:13:02] red.place7:~# birdc show route
BIRD 2.0.7 ready.
Table master6:
...
2a0a:e5c0:13:e2::/108 unicast [place7-server1 2021-06-07] * (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
                     unicast [place7-server4 2021-06-08] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
                     unicast [place7-server2 2021-06-07] (100) [AS65534i]
	via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
                     unicast [place7-server3 2021-06-07] (100) [AS65534i]
	via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
...
```

This results in the following kernel route:

```
2a0a:e5c0:13:e2::/108 proto bird metric 32
	nexthop via 2a0a:e5c0:13:0:224:81ff:fee0:db7a dev eth0 weight 1
	nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 dev eth0 weight 1
	nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 dev eth0 weight 1
	nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc dev eth0 weight 1 pref medium
```
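
The kernel then chooses among these next hops per flow, typically by
hashing the flow's addresses and ports, so a single TCP connection
consistently lands on one node while many flows spread across all
nodes. The idea can be sketched in Python (a toy model; the kernel's
actual hash function differs):

```python
# Toy model of per-flow ECMP next-hop selection: hash a flow key
# and index into the available next hops. Illustrative only; the
# Linux kernel uses its own hash, not SHA-256.
import hashlib

nexthops = [
    "2a0a:e5c0:13:0:224:81ff:fee0:db7a",
    "2a0a:e5c0:13:0:225:b3ff:fe20:3554",
    "2a0a:e5c0:13:0:225:b3ff:fe20:3564",
    "2a0a:e5c0:13:0:225:b3ff:fe20:38cc",
]

def pick_nexthop(src, dst, sport, dport):
    # Same flow tuple -> same hash -> same next hop.
    key = f"{src} {dst} {sport} {dport}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return nexthops[h % len(nexthops)]
```

This per-flow stickiness is why ECMP works as a crude but effective
load balancer in front of the cluster nodes.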

## TL;DR

We know, a TL;DR at the end is not the right thing to do, but hey, we
are at ungleich, aren't we?

In a nutshell, with IPv6 the concepts of **Ingress**,
**Service** and the **LoadBalancer** ServiceType
need to be revised, as IPv6 allows direct access without having
to jump through hoops.
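
For instance, in an IPv6 only cluster whose service network is a
globally routed prefix (as above), a plain Service is already
world-reachable without any LoadBalancer. A minimal sketch; the name
and selector are made up for illustration:

```
apiVersion: v1
kind: Service
metadata:
  name: www            # illustrative name
spec:
  ipFamilyPolicy: SingleStack
  ipFamilies: ["IPv6"]
  selector:
    app: www
  ports:
    - port: 80
```

The resulting ClusterIP comes from the routed IPv6 service prefix, so
clients on the Internet can reach it directly.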

If you are interested in continuing the discussion,
we are there for you in
**the #hacking:ungleich.ch Matrix channel**
([you can sign up here if you don't have an
account](https://chat.with.ungleich.ch)).

Or if you are interested in an IPv6 only kubernetes cluster,
drop a mail to **support**-at-**ungleich.ch**.
@@ -0,0 +1,32 @@
title: Building stateless redundant IPv6 routers
---
pub_date: 2021-04-21
---
author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
It's time for IPv6 in docker, too.
---
body:

```
interface eth1.2
{
  AdvSendAdvert on;
  MinRtrAdvInterval 3;
  MaxRtrAdvInterval 5;
  AdvDefaultLifetime 10;

  prefix 2a0a:e5c0:0:0::/64 { };
  prefix 2a0a:e5c0:0:10::/64 { };

  RDNSS 2a0a:e5c0:0:a::a 2a0a:e5c0:0:a::b { AdvRDNSSLifetime 6000; };
  DNSSL place5.ungleich.ch { AdvDNSSLLifetime 6000; };
};
```
@@ -1,4 +1,4 @@
-title: Accessing IPv4 only hosts via IPv4
+title: Accessing IPv4 only hosts via IPv6
 ---
 pub_date: 2021-02-28
 ---
content/u/products/ungleich-sla/contents.lr (110 lines, new file)
@@ -0,0 +1,110 @@
_discoverable: no
---
_hidden: no
---
title: ungleich SLA levels
---
subtitle: ungleich service level agreements
---
description1:

What is the right SLA (service level agreement) for you? At ungleich
we know that every organisation has individual needs and resources.
Depending on your needs, we offer different types of service level
agreements.

## The standard SLA

If not otherwise specified in the product or service you acquired from
us, the standard SLA applies. This SLA covers standard operations
and is suitable for non-critical deployments. The standard SLA covers:

* Target uptime of all services: 99.9%
* Service level: best effort
* Included for all products
* Support via support@ungleich.ch (answered 9-17 on work days)
* Individual development and support available at a standard rate of 220 CHF/h
* No telephone support

---
feature1_title: Bronze SLA
---
feature1_text:

The business SLA is suited for running regular applications with a
focus on business continuity and individual support. Compared to the
standard SLA it **guarantees you responses within 5 hours** on work
days. You can also **reach our staff at extended hours**.

---
feature2_title: Enterprise SLA
---
feature2_text:

The Enterprise SLA is right for you if you need high availability, but
you don't require instant reaction times from our team.

How this works:

* All services are set up in a high availability setup (additional
  charges for resources apply)
* Target uptime of all services: 99.99%

---
feature3_title: High Availability (HA) SLA
---
feature3_text:
If your application is mission critical, this is the right SLA for
you. The **HA SLA** guarantees high availability and multi location
deployments with cross-datacenter backups and fast reaction times,
24 hours per day.

---
offer1_title: Business SLA
---
offer1_text:

* Target uptime of all services: 99.9%
* Service level: guaranteed reaction within 1 business day
* Individual development and support: 180 CHF/h
* Telephone support (8-18 on work days)
* Mail support (8-18 on work days)
* Optional out of business hours hotline (360 CHF/h)
* 3'000 CHF/6 months

---
offer1_link: https://ungleich.ch/u/contact/
---
offer2_title: Enterprise SLA
---
offer2_text:

* Requires a high availability setup for all services (separate pricing)
* Service level: reaction within 4 hours
* Telephone support (24x7)
* Services are provided in multiple data centers
* Included out of business hours hotline (180 CHF/h)
* 18'000 CHF/6 months

---
offer2_link: https://ungleich.ch/u/contact/
---
offer3_title: HA SLA
---
offer3_text:

* Uptime guarantees >= 99.99%
* Ticketing system reaction time < 3h
* 24x7 telephone support
* Applications running in multiple data centers
* Minimum monthly fee: 3000 CHF (according to individual service definition)

Individual pricing. Contact us on support@ungleich.ch for an individual
quote and we will get back to you.

---
offer3_link: https://ungleich.ch/u/contact/
@@ -58,6 +58,15 @@ Checkout the [SBB
 page](https://www.sbb.ch/de/kaufen/pages/fahrplan/fahrplan.xhtml?von=Zurich&nach=Diesbach-Betschwanden)
 for the next train.
+
+The address is:
+
+```
+Hacking Villa
+Hauptstrasse 28
+8777 Diesbach
+Switzerland
+```
+
 ---
 content1_image: hacking-villa-diesbach.jpg
 ---
@@ -45,6 +45,16 @@ Specifically for learning new technologies and to exchange knowledge
 we created the **Hacking & Learning channel** which can be found at
 **#hacking-and-learning:ungleich.ch**.
+
+## Kubernetes
+
+Recently (in 2021) we started to run Kubernetes clusters at
+ungleich. We share our experiences in **#kubernetes:ungleich.ch**.
+
+## Ceph
+
+To exchange experiences and troubleshooting for Ceph, we are running
+**#ceph:ungleich.ch**.
+
 ## cdist
 
 We meet for cdist discussions about using, developing and more
@@ -57,7 +67,7 @@ We discuss topics related to sustainability in
 
 ## More channels
 
-* The main / hangout channel is **o#town-square:ungleich.ch** (also bridged
+* The main / hangout channel is **#town-square:ungleich.ch** (also bridged
   to Freenode IRC as #ungleich and
   [discord](https://discord.com/channels/706144469925363773/706144469925363776))
 * The bi-yearly hackathon Hack4Glarus can be found in