Merge branch 'master' of code.ungleich.ch:ungleich-public/ungleich-staticcms

This commit is contained in:
sanghee 2021-08-06 11:42:54 +02:00
commit bc5fc19ca7
21 changed files with 2216 additions and 2 deletions


@ -0,0 +1,162 @@
title: Active-Active Routing Paths in Data Center Light
---
pub_date: 2019-11-08
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
---
body:
From our last two blog articles you probably already know that
it is spring network cleanup time in [Data Center Light](https://datacenterlight.ch).
In the [first blog article](/u/blog/datacenterlight-spring-network-cleanup) we described where we started and in
the [second blog article](/u/blog/datacenterlight-ipv6-only-netboot) you could see how we switched our
infrastructure to IPv6 only netboot.
In this article we will dive a bit deeper into the details of our
network architecture and the problems we face with active-active
routers.
## Network architecture
Let's have a look at a simplified (!) diagram of the network:
... IMAGE
Doesn't look that simple, does it? Let's break it down into small
pieces.
## Upstream routers
We have a set of **upstream routers** which work statelessly. They don't
have any stateful firewall rules, so both of them can work actively
without state synchronisation. Moreover, both of them peer with the
data center upstreams. These are fast routers and besides forwarding,
they also do **BGP peering** with our upstreams.
Overall, the upstream routers are very simple machines, mostly running
bird and forwarding packets all day. They also provide a DNS service
(resolving and authoritative), because they are always up and can
announce service IPs via BGP or via OSPF to our network.
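As an illustration, a minimal bird sketch of such an upstream BGP session could look like the following (the AS numbers and the neighbor address are placeholders, not our actual configuration):
```
protocol bgp upstream1 {
        local as 65534;                      # placeholder: our ASN
        neighbor 2001:db8:ffff::1 as 64512;  # placeholder: upstream peer
        ipv6 {
                import all;   # learn routes from the upstream
                export all;   # announce our prefixes (filter appropriately in a real setup)
        };
}
```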
## Internal routers
The internal routers on the other hand provide **stateful routing**,
**IP address assignments** and **netboot services**. They are a bit
more complicated compared to the upstream routers, but they carry only
a small routing table.
## Communication between the routers
All routers employ OSPF and BGP for route exchange. Thus the two
upstream routers learn about the internal networks (IPv6 only, as
usual) from the internal routers.
## Sessions
Sessions in networking are almost always an evil: you need to store
them (at high speed), you need to maintain them (updating, deleting)
and if you run multiple routers, you even need to synchronise them.
In our case the internal routers do have session handling, as they
provide a stateful firewall. As we are using a multi-router setup,
things can go really wrong if the wrong routes are being used.
Let's have a look at this a bit more in detail.
## The good path
IMAGE2: good
If a server sends out a packet via router1 and router1 eventually
receives the answer, everything is fine. The returning packet matches
the state entry that was created by the outgoing packet and the
internal router forwards the packet.
## The bad path
IMAGE3: bad
However, if the answer to a packet that was sent out via router1
comes back via router2, there is no matching state entry on router2:
its stateful firewall drops the returning packet and the connection
stalls.
## Routing paths
If we want to go active-active routing, the server can choose between
either internal router for sending out the packet. The internal
routers again have two upstream routers. So with the return path
included, the following paths exist for a packet:
Outgoing paths:
* servers->router1->upstream router1->internet
* servers->router1->upstream router2->internet
* servers->router2->upstream router1->internet
* servers->router2->upstream router2->internet
And the returning paths are:
* internet->upstream router1->router 1->servers
* internet->upstream router1->router 2->servers
* internet->upstream router2->router 1->servers
* internet->upstream router2->router 2->servers
So on average, 50% of the return paths will hit the right router.
However, servers as well as upstream routers are not using load
balancing like ECMP, so once an incorrect path has been chosen, the
packet loss is 100%.
## Session synchronisation
In the first article we talked a bit about keepalived and that
it helps to operate routers in an active-passive mode. This did not
turn out to be the most reliable method. Can we do better with
active-active routers and session synchronisation?
Linux supports this using
[conntrackd](http://conntrack-tools.netfilter.org/). However,
conntrackd supports active-active routers on a **flow based** level,
but not on a **packet** based level. The difference is that the
following will not work in active-active routers with conntrackd:
```
#1 Packet (in the original direction) updates state in Router R1 ->
submit state to R2
#2 Packet (in the reply direction) arrive to Router R2 before state
coming from R1 has been digested.
With strict stateful filtering, Packet #2 will be dropped and it will
trigger a retransmission.
```
(quote from Pablo Neira Ayuso, see below for more details)
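For reference, a minimal conntrackd sketch for such flow-based state synchronisation between two routers might look roughly like this (the sync interface and addresses are placeholders; we did not end up deploying this):
```
Sync {
    Mode FTFW {
    }
    Multicast {
        # dedicated synchronisation link between the two routers (placeholder values)
        IPv4_address 225.0.0.50
        Group 3780
        IPv4_interface 192.168.100.1
        Interface eth2
    }
}
General {
    HashSize 32768
    LockFile /var/lock/conntrackd.lock
    UNIX {
        Path /var/run/conntrackd.ctl
    }
}
```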
Some of you will mumble something like **latency** in their heads right
now. If the return packet is guaranteed to arrive after state
synchronisation, then everything is fine. However, if the reply is
faster than the state synchronisation, packets will get dropped.
In reality, this will work for packets coming from and going to the
Internet. However, in our setup the upstream routers route between
different data center locations, which are in the sub-microsecond
latency area - i.e. LAN speed - because they are interconnected with
dark fiber links.
## Take away
Before moving on to the next blog article, we would like to express
our thanks to Pablo Neira Ayuso, who gave very important input on
session based firewalls and session synchronisation.
So active-active routing does not seem to have a straightforward
solution. Read in the [next blog
article](/u/blog/datacenterlight-redundant-routing-infrastructure) how
we solved the challenge in the end.


@ -0,0 +1,219 @@
title: IPv6 only netboot in Data Center Light
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
How we switched from IPv4 netboot to IPv6 netboot
---
body:
In our [previous blog
article](/u/blog/datacenterlight-spring-network-cleanup)
we wrote about our motivation for the
big spring network cleanup. In this blog article we show how we
started reducing the complexity by removing our dependency on IPv4.
## IPv6 first
If you have found our blog, you are probably aware: everything at
ungleich is IPv6 first. Many of our networks are IPv6 only, all DNS
entries for remote access have IPv6 (AAAA) entries and there are only
rare exceptions where we use IPv4 in our infrastructure.
## IPv4 only Netboot
One of the big exceptions to this paradigm used to be how we boot our
servers. Because our second big paradigm is sustainability, we use a
lot of 2nd (or 3rd) generation hardware. We actually share this
passion with our friends from
[e-durable](https://recycled.cloud/), because sustainability is
something that we need to employ today and not tomorrow.
But back to the netbooting topic: so far we have mainly relied on
onboard network cards for netbooting.
## Onboard network cards
We used these network cards for multiple reasons:
* they exist in virtually any server
* they usually have a ROM containing a PXE capable firmware
* they allow us to split real traffic (on the fiber cards) from internal traffic
However, using the onboard devices also comes with a couple of disadvantages:
* Their ROM is often outdated
* They require additional cabling
## Cables
Let's have a look at the cabling situation first. Virtually all of
our servers are connected to the network using 2x 10 Gbit/s fiber cards.
On one side this provides a fast connection, but on the other side
it gives us something even better: distance.
Our data centers employ a non-standard design due to the re-use of
existing factory halls. This means distances between servers and
switches can be up to 100m. With fiber, we can easily achieve these
distances.
Additionally, having fewer cables makes for a simpler infrastructure
that is easier to analyse.
## Disabling onboard network cards
So can we somehow get rid of the copper cables and switch to fiber
only? It turns out that the fiber cards we use (mainly Intel X520's)
have their own ROM. So we started disabling the onboard network cards
and tried booting from the fiber cards. This worked until we wanted to
move the lab setup to production...
## Bonding (LACP) and VLAN tagging
Our servers use bonding (802.3ad) for redundant connections to the
switches and VLAN tagging on top of the bonded devices to isolate
client traffic. On the switch side we realised this using
configurations like
```
interface Port-Channel33
switchport mode trunk
mlag 33
...
interface Ethernet33
channel-group 33 mode active
```
But that does not work if the network card's boot ROM does not create
an LACP enabled link on top of which it can do VLAN tagging.
The ROM in our network cards **would** have allowed plain VLAN tagging
though.
To fix this problem, we reconfigured our switches as follows:
```
interface Port-Channel33
switchport trunk native vlan 10
switchport mode trunk
port-channel lacp fallback static
port-channel lacp fallback timeout 20
mlag 33
```
This basically does two things:
* If there are no LACP frames, fall back to a static (non-LACP)
configuration
* Accept untagged traffic and map it to VLAN 10 (one of our boot networks)
Great, our servers can now netboot from fiber! But we are not done
yet...
## IPv6 only netbooting
So how do we convince these network cards to do IPv6 netboot? Can we
actually do that at all? Our first approach was to put a custom build of
[ipxe](https://ipxe.org/) on a USB stick. We generated that
iPXE image using the **rebuild-ipxe.sh** script
from the
[ungleich-tools](https://code.ungleich.ch/ungleich-public/ungleich-tools)
repository. It turns out that using a USB stick works pretty well in most
situations.
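The embedded iPXE script is tiny; a minimal sketch of such a script, chaining to the HTTP boot URL that also appears later in this article, could look like this (the actual script generated by rebuild-ipxe.sh may differ):
```
#!ipxe
# configure the interface via router advertisements / DHCPv6
dhcp
# chain-load the real boot script over HTTP from an IPv6 literal address
chain http://[2a0a:e5c0:0:6::46]/ipxescript
```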
## ROMs are not ROMs
As you can imagine, the ROM of the X520 cards does not contain IPv6
netboot support. So are we back at square 1? No, we are not. Because
the X520's have something that the onboard devices did not
consistently have: **a rewritable memory area**.
Let's take two steps back here first: A ROM is a **read only memory**
chip. Emphasis on **read only**. However, modern network cards and a
lot of devices that support on-device firmware do actually have a
memory (flash) area that can be written to. And that is what aids us
in our situation.
## ipxe + flbtool + x520 = fun
Trying to write ipxe into the X520 cards initially failed, because the
network card did not recognise the format of the ipxe rom file.
Luckily the folks in the ipxe community already spotted that problem
AND fixed it: The format used in these cards is called FLB. And there
is [flbtool](https://github.com/devicenull/flbtool/), which allows you
to wrap the ipxe rom file into the FLB format. For those who want to
try it themselves (at your own risk!), the process basically involves:
* Get the current ROM from the card (try bootutil64e)
* Extract the contents from the rom using flbtool
* This will output some sections/parts
* Locate one part that you want to overwrite with iPXE (a previous PXE
section is very suitable)
* Replace the .bin file with your iPXE rom
* Adjust the .json file to match the length of the new binary
* Build a new .flb file using flbtool
* Flash it onto the card
While this is a bit of work, it is worth it for us, because...:
## IPv6 only netboot over fiber
With the modified ROM, basically loading iPXE at start, we can now
boot our servers in IPv6 only networks. On our infrastructure side, we
added two **tiny** things:
We use ISC dhcp with the following configuration file:
```
option dhcp6.bootfile-url code 59 = string;
option dhcp6.bootfile-url "http://[2a0a:e5c0:0:6::46]/ipxescript";
subnet6 2a0a:e5c0:0:6::/64 {}
```
(that is the complete configuration!)
And we used radvd to announce the "other configuration" flag,
indicating that clients can query the DHCPv6 server:
```
interface bond0.10
{
AdvSendAdvert on;
MinRtrAdvInterval 3;
MaxRtrAdvInterval 5;
AdvDefaultLifetime 600;
# IPv6 netbooting
AdvOtherConfigFlag on;
prefix 2a0a:e5c0:0:6::/64 { };
RDNSS 2a0a:e5c0:0:a::a 2a0a:e5c0:0:a::b { AdvRDNSSLifetime 6000; };
DNSSL place5.ungleich.ch { AdvDNSSLLifetime 6000; } ;
};
```
## Take away
Being able to reduce cabling was one big advantage in the beginning.
Switching to IPv6 only netboot does not seem like a big simplification
at first glance, besides being able to remove IPv4 from server
networks.
However, as you will see in
[the next blog post](/u/blog/datacenterlight-active-active-routing/),
switching to IPv6 only netbooting is actually a key element in
reducing the complexity of our network.


@ -0,0 +1,222 @@
title: Redundant routing infrastructure at Data Center Light
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
---
body:
In case you have missed the previous articles, you can
get [an introduction to the Data Center Light spring
cleanup](/u/blog/datacenterlight-spring-network-cleanup),
see [how we switched to IPv6 only netboot](/u/blog/datacenterlight-ipv6-only-netboot)
or read about [the active-active routing
problems](/u/blog/datacenterlight-active-active-routing/).
In this article we will show how we finally solved the routing issue
conceptually as well as practically.
## Active-active or passive-active routing?
In the [previous blog article](/u/blog/datacenterlight-active-active-routing/)
we reasoned that active-active routing, even with session
synchronisation, does not have a straightforward solution in our
case. However, in the
[first blog article](/u/blog/datacenterlight-spring-network-cleanup)
we reasoned that active-passive routers with VRRP and keepalived are
not stable enough either.
So which path should we take? Or is there another solution?
## Active-Active-Passive Routing
Let us introduce Active-Active-Passive routing. Something that sounds
strange at first, but is going to make sense in the next few
minutes.
We do want multiple active routers, but we do not want to have to
deal with session synchronisation, which is not only tricky, but due
to its complexity can also be a source of error.
So what we are looking for is active-active routing without state
synchronisation. While this sounds like a contradiction, if we loosen
our requirement a little bit, we are able to support multiple active
routers without session synchronisation by using **routing
priorities**.
## Active-Active routing with routing priorities
Let's assume for a moment that all involved hosts (servers, clients,
routers, etc.) know about multiple routes for outgoing and incoming
traffic. Let's assume also for a moment that **we can prioritise**
those routes. Then we can create a deterministic routing path that
does not need session synchronisation.
## Steering outgoing traffic
Let's have a first look at the outgoing traffic. Can we announce
multiple routers in a network, but have the servers and clients
**prefer** one of the routers? The answer is yes!
If we check out the manpage of
[radvd.conf(5)](https://linux.die.net/man/5/radvd.conf) we find a
setting named **AdvDefaultPreference**:
```
AdvDefaultPreference low|medium|high
```
Using this attribute, two routers can both actively announce
themselves, but clients in the network will prefer the one with the
higher preference setting.
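In radvd.conf terms, the preferred router would carry a snippet roughly like the following (interface name and prefix taken from the earlier netboot example; the second router would announce **low** instead):
```
interface bond0.10
{
        AdvSendAdvert on;
        MinRtrAdvInterval 3;
        MaxRtrAdvInterval 5;
        # this router is preferred over the one announcing "low"
        AdvDefaultPreference high;
        prefix 2a0a:e5c0:0:6::/64 { };
};
```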
### Replacing radvd with bird
At this point a short side note: we have been using radvd for some
years in the Data Center Light. However, recently, on our
[Alpine Linux based routers](https://alpinelinux.org/), radvd started
to crash from time to time:
```
[717424.727125] device eth1 left promiscuous mode
[1303962.899600] radvd[24196]: segfault at 63f42258 ip 00007f6bdd59353b sp 00007ffc63f421b8 error 4 in ld-musl-x86_64.so.1[7f6bdd558000+48000]
[1303962.899609] Code: 48 09 c8 4c 85 c8 75 0d 49 83 c4 08 eb d4 39 f0 74 0c 49 ff c4 41 0f b6 04 24 84 c0 75 f0 4c 89 e0 41 5c c3 31 c9 0f b6 04 0f <0f> b6 14 0e 38 d0 75 07 48 ff c1 84 c0 75 ed 29 d0 c3 41 54 49 89
...
[1458460.511006] device eth0 entered promiscuous mode
[1458460.511168] radvd[27905]: segfault at 4dfce818 ip 00007f94ec1fd53b sp 00007ffd4dfce778 error 4 in ld-musl-x86_64.so.1[7f94ec1c2000+48000]
[1458460.511177] Code: 48 09 c8 4c 85 c8 75 0d 49 83 c4 08 eb d4 39 f0 74 0c 49 ff c4 41 0f b6 04 24 84 c0 75 f0 4c 89 e0 41 5c c3 31 c9 0f b6 04 0f <0f> b6 14 0e 38 d0 75 07 48 ff c1 84 c0 75 ed 29 d0 c3 41 54 49 89
...
```
Unfortunately, it seems that either the addresses timed out or that
radvd managed to send a message de-announcing itself prior to the
crash, causing all clients to withdraw their addresses. This is
especially problematic if you run a [ceph](https://ceph.io/) cluster
and the servers suddenly don't have IP addresses anymore...
While we have not yet investigated the full cause of this, there was a
very easy solution: as all of our routers run
[bird](https://bird.network.cz/) and it also supports sending router
advertisements, we replaced radvd with bird. The configuration is
actually pretty simple:
```
protocol radv {
# Internal
interface "eth1.5" {
max ra interval 5; # Fast failover with more routers
other config yes; # dhcpv6 boot
default preference high;
};
rdnss {
lifetime 3600;
ns 2a0a:e5c0:0:a::a;
ns 2a0a:e5c0:0:a::b;
};
dnssl {
lifetime 3600;
domain "place5.ungleich.ch";
};
}
```
## Steering incoming traffic
As the internal and the upstream routers are in the same data center,
we can use an IGP like OSPF to distribute the routes to the internal
routers. And OSPF actually has this very neat metric called **cost**.
So for the router that sets the **default preference high** for the
outgoing routes, we keep the cost at 10; for the router that
sets the **default preference low** we set the cost to 20. The actual
bird configuration on a router looks like this:
```
define ospf_cost = 10;
...
protocol ospf v3 ospf6 {
instance id 0;
ipv6 {
import all;
export none;
};
area 0 {
interface "eth1.*" {
authentication cryptographic;
password "weshouldhaveremovedthisfortheblogpost";
cost ospf_cost;
};
};
}
```
## Incoming + Outgoing = symmetric paths
With both directions under our control, we have now enabled symmetric
routing. Thus, as long as the first router is alive,
all traffic will be handled by it.
## Failover scenario
In case the first router fails, clients have a low lifetime of 15
seconds (3x **max ra interval**)
for their routes and they will fail over to the 2nd router
automatically. Existing sessions will not continue to work, but that
is ok for our setup. When the first router with the higher priority
comes back, there will again be a short interruption, but clients will
automatically change their paths.
And so will the upstream routers, as OSPF is a fast protocol that
quickly notices alive routers and updates routes.
## IPv6 enables active-active-passive routing architectures
At ungleich it almost always comes back to the topic of IPv6, and
for a good reason. You might remember that we claimed in the
[IPv6 only netboot](/u/blog/datacenterlight-ipv6-only-netboot) article
that this reduces complexity? If you look at the above example,
you might not spot it directly, but going IPv6 only is actually an
enabler for our setup:
We **only deploy router advertisements** using bird. We are **not using DHCPv4**
or **IPv4** for accessing our servers. Both routers run a dhcpv6
service in parallel, with the "boot server" pointing to themselves.
Besides being nice and clean,
our whole active-active-passive routing setup **would not work with
IPv4**, because DHCPv4 servers do not offer the same functionality for
providing routing priorities.
## Take away
You can see that trying to solve one problem ("unreliable redundant
router setup") entailed a slew of changes, but in the end made our
infrastructure much simpler:
* No dual stack
* No private IPv4 addresses
* No actively communicating keepalived
* Two fewer daemons to maintain (keepalived, radvd)
We also avoided complex state synchronisation and deployed only Open
Source Software to address our problems. Furthermore, hardware that
looked unusable in modern IPv6 networks could be upgraded with
Open Source Software (iPXE), enabling us to provide more sustainable
infrastructure.
We hope you enjoyed our spring cleanup blog series. The next one will
be coming, because IT infrastructures always evolve. Until then:
feel free to [join our Open Source Chat](https://chat.with.ungleich.ch)
and join the discussion.


@ -0,0 +1,161 @@
title: Data Center Light: Spring network cleanup
---
pub_date: 2021-05-01
---
author: Nico Schottelius
---
twitter_handle: NicoSchottelius
---
_hidden: no
---
_discoverable: no
---
abstract:
An introduction to our spring network cleanup: where we started and what the typical setup used to be in our data center
---
body:
## Introduction
Spring is the time for cleanup: cleaning up your apartment, removing
dust from the cabinet, letting the light shine through the windows,
or, like in our case, improving the networking situation.
In this article we give an introduction of where we started and what
the typical setup used to be in our data center.
## Best practice
When we started [Data Center Light](https://datacenterlight.ch) in
2017, we orientated ourselves towards "best practice" for networking. We
started with IPv6 only networks and used the RFC1918 network (10/8) for
internal IPv4 routing.
And we started with 2 routers for every network to provide
redundancy.
## Router redundancy
So what do you do when you have two routers? In the Linux world the
software [keepalived](https://keepalived.org/)
is very popular for providing redundant routing
using the [VRRP protocol](https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol).
## Active-Passive
While VRRP is designed to allow multiple (not only two) routers to
co-exist in a network, its design is basically active-passive: you
have one active router and n passive routers, in our case one
additional passive router.
## Keepalived: a closer look
A typical keepalived configuration in our network looked like this:
```
vrrp_instance router_v4 {
interface INTERFACE
virtual_router_id 2
priority PRIORITY
advert_int 1
virtual_ipaddress {
10.0.0.1/22 dev eth1.5 # Internal
}
notify_backup "/usr/local/bin/vrrp_notify_backup.sh"
notify_fault "/usr/local/bin/vrrp_notify_fault.sh"
notify_master "/usr/local/bin/vrrp_notify_master.sh"
}
vrrp_instance router_v6 {
interface INTERFACE
virtual_router_id 1
priority PRIORITY
advert_int 1
virtual_ipaddress {
2a0a:e5c0:1:8::48/128 dev eth1.8 # Transfer for routing from outside
2a0a:e5c0:0:44::7/64 dev bond0.18 # zhaw
2a0a:e5c0:2:15::7/64 dev bond0.20 #
}
}
```
This is a template that we distribute via [cdist](https://cdi.st). The
strings INTERFACE and PRIORITY are replaced via cdist. The interface
field defines which interface to use for VRRP communication and the
priority field determines which of the routers is the active one.
So far, so good. However let's have a look at a tiny detail of this
configuration file:
```
notify_backup "/usr/local/bin/vrrp_notify_backup.sh"
notify_fault "/usr/local/bin/vrrp_notify_fault.sh"
notify_master "/usr/local/bin/vrrp_notify_master.sh"
```
These three lines basically say: "start something if you are the
master" and "stop something in case you are not". And why did we do
this? Because of stateful services.
## Stateful services
A typical shell script that we would call contains lines like this:
```
/etc/init.d/radvd stop
/etc/init.d/dhcpd stop
```
(or start in the case of the master version)
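For completeness, a sketch of the corresponding master script (vrrp_notify_master.sh) would then simply start the services again:
```
#!/bin/sh
# called by keepalived when this router becomes the VRRP master
/etc/init.d/radvd start
/etc/init.d/dhcpd start
```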
In earlier days, this even contained openvpn, which was running on our
first generation router version. But more about OpenVPN later.
The reason why we stopped and started dhcp and radvd is to make
clients of the network use the active router. We used radvd to provide
IPv6 addresses as the primary access method to servers. And we used
dhcp mainly to allow servers to netboot. The active router would
carry state (firewall!) and thus the flow of packets always needs to go
through the active router.
Restarting radvd on a different machine keeps the IPv6 addresses the
same, as clients assign them themselves using EUI-64. In the case of
dhcp (IPv4) we could have used hardcoded IPv4 addresses with a mapping
from MAC address to IPv4 address, but we opted against this. The main
reason is that dhcp clients re-request their previous lease anyway, and
even if an IPv4 address changes, it is not really of importance.
During a failover this would lead to a few seconds of interruption and
re-established sessions. Given that routers are usually rather stable
and restarting them is not a daily task, we initially accepted this.
## Keepalived/VRRP changes
One of the more tricky things is making changes to keepalived. Because
keepalived uses the *number of addresses and routes* to verify
that a received VRRP packet matches its configuration, adding or
deleting IP addresses and routes causes a problem:
While only one router has been updated, the number of IP addresses or
routes differs between the two. This causes both routers to ignore the
other's VRRP messages and both routers to think they should be the
master process.
This leads to the problem that both routers receive client and outside
traffic. The firewall (nftables) then does not recognise
returning packets if they were sent out by router1 but received back
by router2 and, because nftables is configured *stateful*, it drops
the returning packet.
However, not only changes to the configuration can trigger this
problem, but also any communication problem between the two
routers. Since 2017 we have experienced multiple times that keepalived
was unable to receive or send messages from or to the other router and
thus both of them again became the master process.
## Take away
While in theory keepalived should improve reliability, in practice
the number of problems we had due to double-master situations made us
question whether the keepalived concept is the fitting one for us.
You can read how we evolved from this setup in
[the next blog article](/u/blog/datacenterlight-ipv6-only-netboot/).


@ -0,0 +1,192 @@
title: GLAMP #1 2021
---
pub_date: 2021-07-17
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
The first un-hack4glarus happens as a camp - Thursday 2021-08-19 to Sunday 2021-08-22.
---
body:
## TL;DR
Get your tent, connect it to power and 10Gbit/s Internet in the midst
of the Glarner mountains. Happening Thursday 2021-08-19 to Sunday 2021-08-22.
Apply for participation by mail (information at the bottom of the page).
## Introduction
It has been some time since our
[last Hack4Glarus](https://hack4glarus.ch) and we have been missing
all our friends, hackers and participants. At ungleich we have been
watching the development of the Coronavirus worldwide and, as you
might know, we have decided against a Hack4Glarus for this summer, as
the Hack4Glarus has been an indoor event so far.
## No Hack4Glarus = GLAMP
However, we want to try a different format that ensures proper
safety. Instead of an indoor Hack4Glarus in Linthal, we introduce
the Glarus Camp (or GLAMP for short) to you: an outdoor event with
sufficient space for distancing. As a camping site we can use the
surroundings of the Hacking Villa, supported by the Hacking Villa
facilities.
Compared to the Hack4Glarus, the GLAMP will focus more on
*relaxation* and *hanging out* than on being a hackathon. We think times
are hard enough, so everyone deserves a break.
## The setting
Many of you know the [Hacking Villa](/u/projects/hacking-villa/) in
Diesbach already. It is located just next to the pretty waterfall and the amazing
Legler Areal. The villa is connected with 10 Gbit/s to the
[Data Center Light](/u/projects/data-center-light/) and offers a lot
of fun things to do.
## Coronavirus measures beforehand
To ensure safety for everyone, we ask everyone attending to provide
reasonable proof of not spreading the coronavirus, in one of the
following forms:
* You have been vaccinated
* You had the coronavirus and have been symptom free for at least 14
days
* You have been tested with a PCR test (at most 7 days old) and the
result was negative
All participants will be required to take a short antigen test on
site.
**Please do not attend if you feel sick for the safety of everyone else.**
## Coronavirus measures on site
To keep the space safe on site as well, we ask you to follow these
rules:
* Sleep in your own tent
* Wear masks inside the Hacking Villa
* Especially if you are preparing food shared with others
* Keep distance and respect others' safety wishes
## Hacking Villa Facilities
* Fast Internet (what more do you need?)
* A shared, open area outside for hacking
* Toilets and a bathroom located inside
## What to bring
* A tent + sleeping equipment
* Fun stuff
* Your computer
* Wifi / IoT / Hacking things
* If you want wired Internet in your tent: a 15m+ Ethernet cable
* WiFi will be provided everywhere
## What is provided
* Breakfast every morning
* A place for a tent
* Power to the tent (Swiss plug)
* WiFi to the tent
* Traditional closing event spaghetti
## What you can find nearby
* A nearby supermarket (2km) reachable by foot, scooter, bike
* A waterfall + barbecue place (~400m)
* Daily attractions such as hacking, hiking, biking, hanging out
## Registration
As the space is limited, we can accommodate about 10 tents (roughly 23
people). To register, send an email to support@ungleich.ch based on
the following template:
```
Subject: GLAMP#1 2021
For each person with you (including yourself):
Non Coronavirus proof:
(see requirements on the glamp page)
Name(s):
(how you want to be called)
Interests:
(will be shown to others at the glamp)
Skills:
(will be shown to others at the glamp)
Food interests:
(we use this for pooling food orders)
What I would like to do:
(will be shown to others at the glamp)
```
The participation fee is 70 CHF/person (to be paid on arrival).
## Time, Date and Location
* Arrival possible from Wednesday 2021-08-18 16:00
* GLAMP#1 starts officially on Thursday 2021-08-19, 1000
* GLAMP#1 closing lunch Sunday 2021-08-22, 1200
* GLAMP#1 ends officially on Sunday 2021-08-22, 1400
Location: [Hacking Villa](/u/projects/hacking-villa/)
## FAQ
### Where do I get Internet?
It is available everywhere at/around the Hacking Villa via WiFi. For
cable based Internet bring a 15m+ Ethernet cable.
### Where do I get Electricity?
You'll get electricity directly to the tent. Additionally the shared
area also has electricity. You can also bring solar panels, if you
like.
### Where do I get food?
Breakfast is provided by us. But what about the rest of the day?
There are a lot of delivery services available, ranging from pizza to
Tibetan, Thai and Swiss (yes!) food.
Nearby are 2 Volg supermarkets, the next Coop is in Schwanden, a bigger
Migros is in Glarus and a very big Coop can be found in Netstal. The Volg
is reachable on foot, all others are reachable by train or bike.
There is also a kitchen inside the Hacking Villa for cooking.
There is also a great barbecue place just next to the waterfall.
### What can I do at the GLAMP?
There are
[alot](http://hyperboleandahalf.blogspot.com/2010/04/alot-is-better-than-you-at-everything.html)
of opportunities at the GLAMP:
You can ...
* just relax and hangout
* hack on a project that you have postponed for too long
* hike up mountains (up to 3612m! Lower is also possible)
* meet other hackers
* explore the biggest water power plant in Europe (Linth Limmern)
* and much much more!


@ -0,0 +1,123 @@
title: Configuring bind to only forward DNS to a specific zone
---
pub_date: 2021-07-25
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Want to use BIND for proxying to another server? This is how you do it.
---
body:
## Introduction
In this article we'll show you an easy solution to host DNS zones on
IPv6 only or private DNS servers. The method we use here is **DNS
forwarding** as offered in ISC BIND, but one could also see this as
**DNS proxying**.
## Background
Sometimes you might have a DNS server that is authoritative for DNS
data, but is not reachable for all clients. This might be the case for
instance, if
* your DNS server is IPv6 only: it won't be directly reachable from
the IPv4 Internet
* your DNS server is running in a private network, either IPv4 or IPv6
In both cases, you need something that is publicly reachable to
enable clients to access the zone, as shown in the following picture:
![](dns-proxy-forward.png)
## The problem: Forwarding requires recursive queries
ISC BIND allows forwarding queries to another name server. However, to
do so, it needs to be configured to allow recursive querying.
But if we allow recursive querying by any client, we basically
create an [open DNS resolver, which can be quite
dangerous](https://www.ncsc.gov.ie/emailsfrom/DDoS/DNS/).
## The solution
ISC BIND by default has a root hints file compiled in, which allows it
to function as a resolver without any additional configuration
files. That is great, but not if you only want it to forward specific
zones as described above. But we can easily fix that problem. Now,
let's have a look at a real world use case, step by step:
### Step 1: Global options
In the first step, we need to set the global options to allow recursion
from anyone, as follows:
```
options {
directory "/var/cache/bind";
listen-on-v6 { any; };
allow-recursion { ::/0; 0.0.0.0/0; };
};
```
However as mentioned above, this would create an open resolver. To
prevent this, let's disable the root hints:
### Step 2: Disable root hints
The root hints are served in the root zone, also known as ".". To
disable them, we give BIND an empty file to use:
```
zone "." {
type hint;
file "/dev/null";
};
```
Note: in case you do want to allow recursion for some
clients, **you can create multiple DNS views**, as sketched below.
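A rough named.conf sketch of such a view-based split could look like this (the internal client prefix is a placeholder; remember that with views, every zone has to live inside a view):
```
view "internal" {
    match-clients { 2001:db8:cafe::/48; };  // placeholder: clients allowed full recursion
    recursion yes;
    // the compiled-in root hints stay usable in this view
};
view "external" {
    match-clients { any; };
    recursion yes;  // still required for the forward zone below
    zone "." {
        type hint;
        file "/dev/null";
    };
    zone "c1.k8s.ooo" {
        type forward;
        forward only;
        forwarders { 2a0a:e5c0:2:f::a; };
    };
};
```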
### Step 3: The actual DNS file
In our case, we have a lot of IPv6 only kubernetes clusters, which are
named `xx.k8s.ooo` and have a globally reachable (but IPv6 only)
CoreDNS server built in. In this case, we want to allow the domain
c1.k8s.ooo to be world reachable, so we configure the dual stack server
as follows:
```
zone "c1.k8s.ooo" {
type forward;
forward only;
forwarders { 2a0a:e5c0:2:f::a; };
};
```
### Step 4: adjusting the zone file
In case your authoritative server is IPv6 only, you also need to adjust
the parent zone file so that IPv4 clients end up at the forwarding
server. In our case this looks as follows:
; The domain: c1.k8s.ooo
c1 NS kube-dns.kube-system.svc.c1
; The IPv6 only DNS server
kube-dns.kube-system.svc.c1 AAAA 2a0a:e5c0:2:f::a
; The forwarding IPv4 server
kube-dns.kube-system.svc.c1 A 194.5.220.43
```
## DNS, IPv6, Kubernetes?
If you are curious to learn more about either of these topics, feel
[free to join us on our chat](/u/projects/open-chat/).


@ -0,0 +1,210 @@
title: Support for IPv6 link local addresses in browsers
---
pub_date: 2021-06-14
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Tracking the progress of browser support for link local addresses
---
body:
## Introduction
Link Local addresses
([fe80::/10](https://en.wikipedia.org/wiki/Link-local_address)) are
used for addressing devices in your local subnet. They can be
automatically generated and using the IPv6 multicast address
**ff02::1**, all hosts on the local subnet can easily be located.
However browsers like Chrome or Firefox do not support **entering link
local addresses inside a URL**, which prevents accessing devices
locally with a browser, for instance for configuring them.
Link local addresses need **zone identifiers** to specify which
network device to use as an outgoing interface. This is because
**you have link local addresses on every interface** and your network
stack does not know on its own which interface to use. So typically a
link local address is something along the lines of
**fe80::fae4:e3ff:fee2:37a4%eth0**, where **eth0** is the zone
identifier.
The problem is becoming more pronounced, as the world is moving more
and more towards **IPv6 only networks**.
You might not even know the address of your network equipment anymore,
but you can easily locate it using the **ff02::1 multicast
address**. So we need support in browsers to allow network
configuration.
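On the command line this already works today; it is only the browser URL bar that rejects the zone identifier. A small example (interface and addresses are placeholders):
```
# discover all IPv6 hosts on the link attached to eth0
ping6 -c 2 ff02::1%eth0
# command line tools such as curl accept the (percent-encoded) zone identifier ...
curl -g 'http://[fe80::fae4:e3ff:fee2:37a4%25eth0]/'
# ... but pasting http://[fe80::fae4:e3ff:fee2:37a4%eth0]/ into Firefox or Chrome fails
```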
## Status of implementation
The main purpose of this document is to track the status of the
link-local address support in the different browsers and related
standards. The current status is:
* Firefox says whatwg did not define it
* Whatwg says the zone id is intentionally omitted and references w3.org
* w3.org has a longer reasoning, but it basically boils down to
"Firefox and chrome don't do it and it's complicated and nobody needs it"
* Chromium says it seems not to be worth the effort
Given that chain of events, if either Firefox, Chrome, w3.org or
Whatwg were to add support for it, it seems likely that the others
would follow.
## IPv6 link local address support in Firefox
The progress of IPv6 link local addresses for Firefox is tracked
on [the mozilla
bugzilla](https://bugzilla.mozilla.org/show_bug.cgi?id=700999). The
current situation is that Firefox references the lack of
standardisation by whatwg as the reason for not implementing it. Quoting
Valentin Gosu from the Mozilla team:
```
The main reason the zone identifier is not supported in Firefox is
that parsing URLs is hard. You'd think we can just pass whatever
string to the system API and it will work or fail depending on whether
it's valid or not, but that's not the case. In bug 1199430 for example
it was apparent that we need to make sure that the hostname string is
really valid before passing it to the OS.
I have no reason to oppose zone identifiers in URLs as long as the URL
spec defines how to parse them. As such, I encourage you to engage
with the standard at https://github.com/whatwg/url/issues/392 instead
of here.
Thank you!
```
## IPv6 link local address support in whatwg
The situation at [whatwg](https://whatwg.org/) is that there is a
[closed bug report on github](https://github.com/whatwg/url/issues/392)
and [in the spec it says](https://url.spec.whatwg.org/#concept-ipv6)
that
Support for <zone_id> is intentionally omitted.
That paragraph links to a bug registered at w3.org (see next chapter).
## IPv6 link local address support at w3.org
At [w3.org](https://www.w3.org/) there is a
bug titled
[Support IPv6 link-local
addresses?](https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2)
that is set to status **RESOLVED WONTFIX**. It is closed basically
based on the following statement from Ryan Sleevi:
```
Yes, we're especially not keen to support these in Chrome and have
repeatedly decided not to. The platform-specific nature of <zone_id>
makes it difficult to impossible to validate the well-formedness of
the URL (see https://tools.ietf.org/html/rfc4007#section-11.2 , as
referenced in 6874, to fully appreciate this special hell). Even if we
could reliably parse these (from a URL spec standpoint), it then has
to be handed 'somewhere', and that opens a new can of worms.
Even 6874 notes how unlikely it is to encounter these in practice -
"Thus, URIs including a
ZoneID are unlikely to be encountered in HTML documents. However, if
they do (for example, in a diagnostic script coded in HTML), it would
be appropriate to treat them exactly as above."
Note that a 'dumb' parser may not be sufficient, as the Security Considerations of 6874 note:
"To limit this risk, implementations MUST NOT allow use of this format
except for well-defined usages, such as sending to link-local
addresses under prefix fe80::/10. At the time of writing, this is
the only well-defined usage known."
And also
"An HTTP client, proxy, or other intermediary MUST remove any ZoneID
attached to an outgoing URI, as it has only local significance at the
sending host."
This requires a transformative rewrite of any URLs going out the
wire. That's pretty substantial. Anne, do you recall the bug talking
about IP canonicalization (e.g. http://127.0.0.1 vs
http://[::127.0.0.1] vs http://012345 and friends?) This is
conceptually a similar issue - except it's explicitly required in the
context of <zone_id> that the <zone_id> not be emitted.
There's also the issue that zone_id precludes/requires the use of APIs
that user agents would otherwise prefer to avoid, in order to
'properly' handle the zone_id interpretation. For example, Chromium on
some platforms uses a built in DNS resolver, and so our address lookup
functions would need to define and support <zone_id>'s and map them to
system concepts. In doing so, you could end up with weird situations
where a URL works in Firefox but not Chrome, even though both
'hypothetically' supported <zone_id>'s, because FF may use an OS
routine and Chrome may use a built-in routine and they diverge.
Overall, our internal consensus is that <zone_id>'s are bonkers on
many grounds - the technical ambiguity (and RFC 6874 doesn't really
resolve the ambiguity as much as it fully owns it and just says
#YOLOSWAG) - and supporting them would add a lot of complexity for
what is explicitly and admittedly a limited value use case.
```
This bug references the Mozilla Firefox bug above and
[RFC 6874 (which updates RFC
3986)](https://datatracker.ietf.org/doc/html/rfc6874#section-2).
## IPv6 link local address support in Chrome / Chromium
On the chrome side there is a
[huge bug
report](https://bugs.chromium.org/p/chromium/issues/detail?id=70762)
which in turn references a large number of other bugs requesting
IPv6 link local support, too.
The bug was closed by cbentzel@chromium.org stating:
```
There are a large number of special cases which are required on core
networking/navigation/etc. and it does not seem like it is worth the
up-front and ongoing maintenance costs given that this is a very
niche - albeit legitimate - need.
```
The bug at chromium has been made un-editable, so it is basically
frozen, even though people had added suggestions to the ticket on how to
solve it.
## Work Arounds
### IPv6 link local connect hack
Peter has [documented the IPv6 link local connect
hack](https://website.peterjin.org/wiki/Snippets:IPv6_link_local_connect_hack)
to make Firefox use **fe90:0:[scope id]:[IP address]** to reach
**fe80::[IP address]%[scope id]**. Check out his website for details!
### IPv6 hack using ip6tables
Also from Peter comes the hint that you can use newer ip6tables
versions to achieve a similar mapping:
"On modern Linux kernels you can also run
```ip6tables -t nat -A OUTPUT -d fef0::/64 -j NETMAP --to fe80::/64```
if you have exactly one outbound interface, so that fef0::1 translates
to fe80::1"
Thanks again for the pointer!
## Other resources
If you are aware of other resources regarding IPv6 link local support
in browsers, please join the [IPv6.chat](https://IPv6.chat) and let us
know about it.


@ -0,0 +1,144 @@
title: Automatic A and AAAA DNS entries with NAT64 for kubernetes?
---
pub_date: 2021-06-24
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Given a kubernetes cluster and NAT64 - how do you create DNS entries?
---
body:
## The DNS kubernetes quiz
Today our blog entry does not (yet) show a solution, but more a tricky
quiz on creating DNS entries. The problem to solve is the following:
* How to make every IPv6 only service in kubernetes also IPv4
reachable?
Let's see who can solve it first or most elegantly. Below are some
thoughts on how to approach this problem.
## The situation
Assume your kubernetes cluster is IPv6 only and all services
have proper AAAA DNS entries. This allows you
[to directly receive traffic from the
Internet](/u/blog/kubernetes-without-ingress/) to
your kubernetes services.
Now, to make that service also IPv4 reachable, we can deploy a NAT64
service that maps an IPv4 address outside the cluster to an IPv6 service
address inside the cluster:
```
A.B.C.D --> 2001:db8::1
```
So all traffic to that IPv4 address is converted to IPv6 by the
external NAT64 translator.
## The proxy service
Let's say the service running on 2001:db8::1 is named "ipv4-proxy" and
thus reachable at ipv4-proxy.default.svc.example.com.
What we want to achieve is to expose every possible service
inside the cluster **also via IPv4**. For this purpose we have created
an haproxy container that accepts traffic for *.svc.example.com and
forwards it via IPv6.
So the actual flow would look like:
```
IPv4 client --[ipv4]--> NAT64 -[ipv6]-> proxy service
|
|
v
IPv6 client ---------------------> kubernetes service
```
## The DNS dilemma
It would be very tempting to create a wildcard DNS entry or to
configure/patch CoreDNS to also include an A entry for every service
that is:
```
*.svc IN A A.B.C.D
```
So essentially all services resolve to the IPv4 address A.B.C.D. That
however would also influence the kubernetes cluster, as pods
potentially resolve A entries (not only AAAA) as well.
As the containers / pods do not have any IPv4 address (nor IPv4
routing), access to IPv4 is not possible. There are various outcomes
of this situation:
1. The software in the container does happy eyeballs and tries both
A/AAAA and uses the working IPv6 connection.
2. The software in the container misbehaves and takes the first record
and uses IPv4 (nodejs is known to have or had a broken resolver
that did exactly that).
So adding that wildcard might not be the smartest option. And
additionally it is unclear whether coreDNS would support that.
## Alternative automatic DNS entries
The *.svc names in a kubernetes cluster are special in the sense that
they are used for connecting internally. What if coreDNS (or any other
DNS server) would, instead of using *.svc, use a second subdomain like
*abc*.*namespace*.v4andv6.example.com and generate the same AAAA
record as for the service plus a static A record as described above?
That could solve the problem. But again, does coreDNS support that?
## Automated DNS entries in other zones
Instead of fully automatically creating the entries as above, another
option would be to specify DNS entries via annotations in a totally
different zone, if coreDNS supported this. So let's say we also
have control over example.org and we could instruct coreDNS to create
the following entries automatically with an annotation:
```
abc.something.example.org AAAA <same as the service IP>
abc.something.example.org A <a static IPv4 address A.B.C.D>
```
In theory this might be solved via some scripting, maybe via a DNS
server like powerDNS?
## Alternative solution with BIND
The bind DNS server, which is not usually deployed in a kubernetes
cluster, supports **views**. Views enable different replies to the
same query depending on the source IP address. Thus, in theory,
something like the following could be done, assuming a secondary zone
*example.org* (see the sketch after the list):
* If the request comes from the kubernetes cluster, return a CNAME
back to example.com.
* If the request comes from outside the kubernetes cluster, return an
A entry with the static IP
* Unsolved: how to match on the AAAA entries (because we don't CNAME
with the added A entry)
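A very rough named.conf sketch of this idea (the cluster prefix and the zone file names are placeholders; the AAAA problem from the last bullet point remains unsolved):
```
view "cluster" {
    match-clients { 2001:db8:cafe::/48; };  // placeholder: the kubernetes cluster prefix
    zone "example.org" {
        type master;
        file "example.org.cluster";  // would contain CNAMEs pointing back to *.svc.example.com
    };
};
view "world" {
    match-clients { any; };
    zone "example.org" {
        type master;
        file "example.org.world";    // would contain static A records pointing to A.B.C.D
    };
};
```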
## Other solution?
As you can see, mixing dynamic IP generation and coupling it with
static DNS entries for IPv4 resolution is not the easiest of tasks. If
you have a smart idea on how to solve this without manually creating
entries for each and every service,
[give us a shout!](/u/contact)


@ -0,0 +1,227 @@
title: Making kubernetes kube-dns publicly reachable
---
pub_date: 2021-06-13
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Looking into IPv6 only DNS provided by kubernetes
---
body:
## Introduction
If you have seen our
[article about running kubernetes
Ingress-less](/u/blog/kubernetes-without-ingress/), you are aware that
we are pushing IPv6 only kubernetes clusters at ungleich.
Today, we are looking at making the "internal" kube-dns service world
reachable using IPv6 and global DNS servers.
## The kubernetes DNS service
If you have a look at your typical k8s cluster, you will notice that
you usually have two coredns pods running:
```
% kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-558bd4d5db-gz5c7 1/1 Running 0 6d
coredns-558bd4d5db-hrzhz 1/1 Running 0 6d
```
These pods are usually served by the **kube-dns** service:
```
% kubectl -n kube-system get svc -l k8s-app=kube-dns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 2a0a:e5c0:13:e2::a <none> 53/UDP,53/TCP,9153/TCP 6d1h
```
As you can see, the kube-dns service is running on a publicly
reachable IPv6 address.
## IPv6 only DNS
IPv6 only DNS servers have one drawback: they cannot be reached during
DNS recursion if the resolver is IPv4 only.
At [ungleich we run a variety of
services](https://redmine.ungleich.ch/projects/open-infrastructure/wiki)
to make IPv6 only services usable in the real world. In case of DNS,
we are using **DNS forwarders**. They are acting similar to HTTP
proxies, but for DNS.
So in our main DNS servers, dns1.ungleich.ch, dns2.ungleich.ch
and dns3.ungleich.ch we have added the following configuration:
```
zone "k8s.place7.ungleich.ch" {
type forward;
forward only;
forwarders { 2a0a:e5c0:13:e2::a; };
};
```
This tells the DNS servers to forward DNS queries that come in for
k8s.place7.ungleich.ch to **2a0a:e5c0:13:e2::a**.
Additionally we have added **DNS delegation** in the
place7.ungleich.ch zone:
```
k8s NS dns1.ungleich.ch.
k8s NS dns2.ungleich.ch.
k8s NS dns3.ungleich.ch.
```
## Using the kubernetes DNS service in the wild
With this configuration, we can now access IPv6 only
kubernetes services directly from the Internet. Let's first discover
the kube-dns service itself:
```
% dig kube-dns.kube-system.svc.k8s.place7.ungleich.ch. aaaa
; <<>> DiG 9.16.16 <<>> kube-dns.kube-system.svc.k8s.place7.ungleich.ch. aaaa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23274
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: f61925944f5218c9ac21e43960c64f254792e60f2b10f3f5 (good)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.k8s.place7.ungleich.ch. IN AAAA
;; ANSWER SECTION:
kube-dns.kube-system.svc.k8s.place7.ungleich.ch. 27 IN AAAA 2a0a:e5c0:13:e2::a
;; AUTHORITY SECTION:
k8s.place7.ungleich.ch. 13 IN NS kube-dns.kube-system.svc.k8s.place7.ungleich.ch.
```
As you can see, the **kube-dns** service in the **kube-system**
namespace resolves to 2a0a:e5c0:13:e2::a, which is exactly what we
have configured.
At the moment, there is also an etherpad test service
named "ungleich-etherpad" running:
```
% kubectl get svc -l app=ungleichetherpad
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ungleich-etherpad ClusterIP 2a0a:e5c0:13:e2::b7db <none> 9001/TCP 3d19h
```
Let's first verify that it resolves:
```
% dig +short ungleich-etherpad.default.svc.k8s.place7.ungleich.ch aaaa
2a0a:e5c0:13:e2::b7db
```
And if that works, well, then we should also be able to access the
service itself!
```
% curl -I http://ungleich-etherpad.default.svc.k8s.place7.ungleich.ch:9001/
HTTP/1.1 200 OK
X-Powered-By: Express
X-UA-Compatible: IE=Edge,chrome=1
Referrer-Policy: same-origin
Content-Type: text/html; charset=utf-8
Content-Length: 6039
ETag: W/"1797-Dq3+mr7XP0PQshikMNRpm5RSkGA"
Set-Cookie: express_sid=s%3AZGKdDe3FN1v5UPcS-7rsZW7CeloPrQ7p.VaL1V0M4780TBm8bT9hPVQMWPX5Lcte%2BzotO9Lsejlk; Path=/; HttpOnly; SameSite=Lax
Date: Sun, 13 Jun 2021 18:36:23 GMT
Connection: keep-alive
Keep-Alive: timeout=5
```
(Attention: this is a test service and might not be running anymore when
you read this article at a later time.)
## IPv6 vs. IPv4
Could we have achieved the same with IPv4? The answer here is "maybe":
If the kubernetes service is reachable from globally reachable
nameservers via IPv4, then the answer is yes. This could be done via
public IPv4 addresses in the kubernetes cluster, via tunnels, VPNs,
etc.
However, generally speaking, the DNS service of a
kubernetes cluster running on RFC1918 IP addresses is probably not
reachable from globally reachable DNS servers by default.
For IPv6 the case is a bit different: we are using globally reachable
IPv6 addresses in our k8s clusters, so they can potentially be
reachable without the need for any tunnel whatsoever. Firewalling
and network policies can obviously prevent access, but if the IP
addresses are properly routed, they will be accessible from the public
Internet.
And this makes things much easier for DNS servers that also have
IPv6 connectivity.
The following picture shows the practical difference between the two
approaches:
![](/u/image/k8s-v6-v4-dns.png)
## Does this make sense?
That clearly depends on your use-case. If you want your service DNS
records to be publicly accessible, then the clear answer is yes.
If your cluster services are intended to be internal only
(see the [previous blog post](/u/blog/kubernetes-without-ingress/)), then
exposing the DNS service to the world might not be the best option.
## Note on security
CoreDNS inside kubernetes is by default configured to allow resolving
for *any* client that can reach it. Thus if you make your kube-dns
service world reachable, you also turn it into an open resolver.
At the time of writing this blog article, the following coredns
configuration **does NOT** correctly block requests:
```
Corefile: |
.:53 {
acl k8s.place7.ungleich.ch {
allow net ::/0
}
acl . {
allow net 2a0a:e5c0:13::/48
block
}
forward . /etc/resolv.conf {
max_concurrent 1000
}
...
```
Until this is solved, we recommend placing a firewall in front of your
public kube-dns service that only allows requests from the forwarding
DNS servers.
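As a sketch, such a filter could look like the following in nftables, assuming it runs on a router in front of the cluster (the addresses of the forwarding DNS servers are placeholders):
```
# allow only the forwarding DNS servers to query the world-reachable kube-dns service
table ip6 kube_dns_filter {
    chain forward {
        type filter hook forward priority 0; policy accept;
        ip6 daddr 2a0a:e5c0:13:e2::a udp dport 53 ip6 saddr { 2001:db8::53:1, 2001:db8::53:2 } accept
        ip6 daddr 2a0a:e5c0:13:e2::a tcp dport 53 ip6 saddr { 2001:db8::53:1, 2001:db8::53:2 } accept
        ip6 daddr 2a0a:e5c0:13:e2::a udp dport 53 drop
        ip6 daddr 2a0a:e5c0:13:e2::a tcp dport 53 drop
    }
}
```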
## More of this
We are discussing
kubernetes and IPv6 related topics in
**the #hacking:ungleich.ch Matrix channel**
([you can signup here if you don't have an
account](https://chat.with.ungleich.ch)) and will post more about our
k8s journey in this blog. Stay tuned!


@ -0,0 +1,122 @@
title: Kubernetes Network planning with IPv6
---
pub_date: 2021-06-26
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
Learn which networks are good to use with kubernetes
---
body:
## Introduction
While IPv6 has a huge address space, you will need to specify a
**podCidr** (the network for the pods) and a **serviceCidr** (the
network for the services) for kubernetes. In this blog article we show
our findings and give a recommendation on the "most sensible"
networks to use for kubernetes.
## TL;DR
## Kubernetes limitations
In a typical IPv6 network, you would "just assign a /64" to anything
that needs to be a network. It is a bit the no-brainer way of
handling IPv6 networking.
[the serviceCidr cannot be bigger than a /108 at the
moment](https://github.com/kubernetes/kubernetes/pull/90115).
This is something very atypical for the IPv6 world, but nothing we
cannot handle. There are various pull requests and issues to fix this
behaviour on github, some of them listed below:
* https://github.com/kubernetes/enhancements/pull/1534
* https://github.com/kubernetes/kubernetes/pull/79993
* https://github.com/kubernetes/kubernetes/pull/90115 (this one is
quite interesting to read)
That said, it is possible to use a /64 for the **podCidr**.
## The "correct way" without the /108 limitation
If kubernetes did not have this limitation, our recommendation would
be to use one /64 for the podCidr and one /64 for the serviceCidr. If
in the future the limitations of kubernetes have been lifted, skip
reading this article and just use two /64's.
Do not be tempted to suggest making /108's the default, even if they
"have enough space", because using /64's allows you to stay with much
simpler network plans.
## Sanity checking the /108
To be able to plan kubernetes clusters, it is important to know where
they should live, especially if you plan on having a lot of kubernetes
clusters. Let's have a short look at the /108 network limitation:
A /108 leaves 20 bits to be used for generating addresses, or a total
of 1048576 (2^20) hosts. This is probably enough for the number of services
in a cluster. Now, can we be consistent and also use a /108 for the
podCidr? Let's assume for the moment that we do exactly that, so we
run a maximum of 1048576 pods at the same time. Assuming each service
consumes on average 4 pods, this would allow one to run 262144
services.
Assuming each pod uses around 0.1 CPUs and 100Mi RAM, if all pods were
to run at the same time, you would need ca. 100'000 CPUs and 100 TB
RAM. Assuming further that each node contains at maximum 128 CPUs and
at maximum 1 TB RAM (quite powerful servers), we would need more than
750 servers just for the CPUs.
So we can reason that **we can** run kubernetes clusters of quite some
size even with a **podCidr of /108**.
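The numbers above are easy to verify:

```
% echo $((2**20))        # addresses in a /108 (128 - 108 = 20 bits)
1048576
% echo $((2**20 / 4))    # services, assuming 4 pods per service
262144
% echo $((2**20 / 10))   # CPUs, assuming 0.1 CPU per pod
104857
```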
## Organising /108's
Let's assume that we organise all our kubernetes clusters in a single
/64, like 2001:db8:1:2::/64, which looks like this:
```
% sipcalc 2001:db8:1:2::/64
-[ipv6 : 2001:db8:1:2::/64] - 0
[IPV6 INFO]
Expanded Address - 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address - 2001:db8:1:2::
Subnet prefix (masked) - 2001:db8:1:2:0:0:0:0/64
Address ID (masked) - 0:0:0:0:0:0:0:0/64
Prefix address - ffff:ffff:ffff:ffff:0:0:0:0
Prefix length - 64
Address type - Aggregatable Global Unicast Addresses
Network range - 2001:0db8:0001:0002:0000:0000:0000:0000 -
2001:0db8:0001:0002:ffff:ffff:ffff:ffff
```
A /108 network on the other hand looks like this:
```
% sipcalc 2001:db8:1:2::/108
-[ipv6 : 2001:db8:1:2::/108] - 0
[IPV6 INFO]
Expanded Address - 2001:0db8:0001:0002:0000:0000:0000:0000
Compressed address - 2001:db8:1:2::
Subnet prefix (masked) - 2001:db8:1:2:0:0:0:0/108
Address ID (masked) - 0:0:0:0:0:0:0:0/108
Prefix address - ffff:ffff:ffff:ffff:ffff:ffff:fff0:0
Prefix length - 108
Address type - Aggregatable Global Unicast Addresses
Network range - 2001:0db8:0001:0002:0000:0000:0000:0000 -
2001:0db8:0001:0002:0000:0000:000f:ffff
```
Assuming for a moment that we assign a /108, this looks as follows:

View file

@ -0,0 +1,70 @@
title: ungleich production cluster #1
---
pub_date: 2021-07-05
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
In this blog article we describe our way to our first production
kubernetes cluster.
---
body:
## Introduction
This article is a work in progress (WIP) describing all steps required for our first
production kubernetes cluster and the services that we run in it.
## Setup
### Bootstrapping
* All nodes are running [Alpine Linux](https://alpinelinux.org)
* All nodes are configured using [cdist](https://cdi.st)
* Mainly installing kubeadm, kubectl, crio *and* docker
* At the moment we try to use crio
* The cluster is initialised using **kubeadm init --config
k8s/c2/kubeadm.yaml** from the [ungleich-k8s repo](https://code.ungleich.ch/ungleich-public/ungleich-k8s)
### CNI/Networking
* Calico is installed using **kubectl apply -f
cni-calico/calico.yaml** from the [ungleich-k8s
repo](https://code.ungleich.ch/ungleich-public/ungleich-k8s)
* Installing calicoctl using **kubectl apply -f
https://docs.projectcalico.org/manifests/calicoctl.yaml**
* Aliasing calicoctl: **alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"**
* All nodes BGP peer with our infrastructure using **calicoctl create -f - < cni-calico/bgp-c2.yaml** (a rough sketch of such a manifest is shown below)
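The exact contents of bgp-c2.yaml are not reproduced here; as a rough,
hypothetical sketch (the addresses and AS numbers below are
illustrative placeholders), a Calico BGP setup of this kind announces
the service network and peers with an upstream router:

```
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  asNumber: 65534
  serviceClusterIPs:
    - cidr: 2a0a:e5c0:13:e2::/108
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router-1
spec:
  peerIP: 2a0a:e5c0:13::1
  asNumber: 65533
```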
### Persistent Volume Claim support
* Provided by rook
* Using customised manifests from ungleich-k8s to support IPv6
```
# apply the rook manifests (customised for IPv6) in dependency order
for yaml in crds common operator cluster storageclass-cephfs storageclass-rbd toolbox; do
    kubectl apply -f ${yaml}.yaml
done
```
### Flux
Starting with the 2nd cluster?
## Follow up
If you are interested in continuing the discussion,
we are there for you in
**the #kubernetes:ungleich.ch Matrix channel**
[you can signup here if you don't have an
account](https://chat.with.ungleich.ch).
Or if you are interested in an IPv6 only kubernetes cluster,
drop a mail to **support**-at-**ungleich.ch**.

View file

@ -0,0 +1,201 @@
title: Building Ingress-less Kubernetes Clusters
---
pub_date: 2021-06-09
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
---
body:
## Introduction
On [our journey to build and define IPv6 only kubernetes
clusters](https://www.nico.schottelius.org/blog/k8s-ipv6-only-cluster/)
we came across some principles that seem awkward in the IPv6 only
world. Let us today have a look at the *LoadBalancer* and *Ingress*
concepts.
## Ingress
Let's have a look at the [Ingress
definition](https://kubernetes.io/docs/concepts/services-networking/ingress/)
from the kubernetes website:
```
Ingress exposes HTTP and HTTPS routes from outside the cluster to
services within the cluster. Traffic routing is controlled by rules
defined on the Ingress resource.
```
So the ingress basically routes from outside to inside. But, in the
IPv6 world, services are already publicly reachable. It just
depends on your network policy.
### Update 2021-06-13: Ingress vs. Service
As some people pointed out (thanks a lot!), a public service is
**not the same** as an Ingress. An Ingress can also route based on
layer 7 information such as the path or the domain name.
However, if all of the traffic from an Ingress points to a single
IPv6 HTTP/HTTPS Service, the IPv6 service effectively does the
same, with one hop less.
## Services
Let's have a look at what services in IPv6 only clusters look like:
```
% kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
etherpad ClusterIP 2a0a:e5c0:13:e2::a94b <none> 9001/TCP 19h
nginx-service ClusterIP 2a0a:e5c0:13:e2::3607 <none> 80/TCP 43h
postgres ClusterIP 2a0a:e5c0:13:e2::c9e0 <none> 5432/TCP 19h
...
```
All these services are world reachable, depending on your network
policy.
## ServiceTypes
While we are looking at the k8s primitives, let's have a closer
look at the **Service**, specifically at 3 of the **ServiceTypes**
supported by k8s, including their definitions:
### ClusterIP
The k8s website says
```
Exposes the Service on a cluster-internal IP. Choosing this value
makes the Service only reachable from within the cluster. This is the
default ServiceType.
```
So in the context of IPv6, this sounds wrong. There is nothing that
makes a global IPv6 address "internal", besides possible network
policies. The concept probably stems from the strict split between
the RFC1918 space usually used inside k8s clusters and public IPv4
space. This distinction does not make a lot of sense in the IPv6
world. Seeing **services as public by default** makes much more
sense and simplifies your clusters a lot.
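For example, a plain ClusterIP Service in a single stack IPv6 cluster
can be declared roughly as follows (a sketch; the selector and port
are made up for illustration) and receives a globally routable
address like the ones listed above:

```
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP              # the default ServiceType
  ipFamilies: [IPv6]
  ipFamilyPolicy: SingleStack
  selector:
    app: nginx
  ports:
    - port: 80
```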
### NodePort
Let's first have a look at the definition again:
```
Exposes the Service on each Node's IP at a static port (the
NodePort). A ClusterIP Service, to which the NodePort Service routes,
is automatically created. You'll be able to contact the NodePort
Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
```
Conceptually this can be utilised in the IPv6 only world similarly to
how it is used in the IPv4 world. However, given that there are enough
addresses available with IPv6, this might not be such an interesting
ServiceType anymore.
### LoadBalancer
Before we have a look at this type, let's take some steps back
first to ...
## ... Load Balancing
There are a variety of possibilities to do load balancing: from simple
round robin, to ECMP based load balancing, to application aware,
potentially weighted load balancing.
So for load balancing, there is usually more than one solution and
there is likely no one-size-fits-all.
With this said, let's have a look at the
**ServiceType LoadBalancer** definition:
```
Exposes the Service externally using a cloud provider's load
balancer. NodePort and ClusterIP Services, to which the external load
balancer routes, are automatically created.
```
So whatever the cloud provider offers can be used, and that is a good
thing. However, let's have a look at how you get load balancing for
free in IPv6 only clusters:
## Load Balancing in IPv6 only clusters
So what is the easiest way of getting reliable load balancing in a network?
[ECMP (equal cost multi path)](https://en.wikipedia.org/wiki/Equal-cost_multi-path_routing)
comes to mind right away. Given that
kubernetes nodes can BGP peer with the network (the upstream routers or the
switches), this basically gives load balancing towards the world for free:
```
[ The Internet ]
|
[ k8s-node-1 ]-----------[ network ]-----------[ k8s-node-n]
[ ECMP ]
|
[ k8s-node-2]
```
In the real world on a bird based BGP upstream router
this looks as follows:
```
[18:13:02] red.place7:~# birdc show route
BIRD 2.0.7 ready.
Table master6:
...
2a0a:e5c0:13:e2::/108 unicast [place7-server1 2021-06-07] * (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 on eth0
unicast [place7-server4 2021-06-08] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 on eth0
unicast [place7-server2 2021-06-07] (100) [AS65534i]
via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc on eth0
unicast [place7-server3 2021-06-07] (100) [AS65534i]
via 2a0a:e5c0:13:0:224:81ff:fee0:db7a on eth0
...
```
Which results into the following kernel route:
```
2a0a:e5c0:13:e2::/108 proto bird metric 32
nexthop via 2a0a:e5c0:13:0:224:81ff:fee0:db7a dev eth0 weight 1
nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:3554 dev eth0 weight 1
nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:3564 dev eth0 weight 1
nexthop via 2a0a:e5c0:13:0:225:b3ff:fe20:38cc dev eth0 weight 1 pref medium
```
## TL;DR
We know, a TL;DR at the end is not the right thing to do, but hey, we
are at ungleich, aren't we?
In a nutshell, with IPv6 the concepts of **Ingress**,
**Service** and the **LoadBalancer** ServiceType
need to be revised, as IPv6 allows direct access without having
to jump through hoops.
If you are interested in continuing the discussion,
we are there for you in
**the #hacking:ungleich.ch Matrix channel**
[you can signup here if you don't have an
account](https://chat.with.ungleich.ch).
Or if you are interested in an IPv6 only kubernetes cluster,
drop a mail to **support**-at-**ungleich.ch**.

View file

@ -0,0 +1,32 @@
title: Building stateless redundant IPv6 routers
---
pub_date: 2021-04-21
---
author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: no
---
abstract:
It's time for IPv6 in docker, too.
---
body:
```
# Router Advertisement configuration. With short advertisement intervals
# and AdvDefaultLifetime 10, hosts stop using a router at most ~10 seconds
# after it goes silent, which is what makes running two such routers
# side by side (stateless, without synchronisation) feasible.
interface eth1.2
{
        AdvSendAdvert on;
        MinRtrAdvInterval 3;
        MaxRtrAdvInterval 5;
        AdvDefaultLifetime 10;
        prefix 2a0a:e5c0:0:0::/64 { };
        prefix 2a0a:e5c0:0:10::/64 { };
        RDNSS 2a0a:e5c0:0:a::a 2a0a:e5c0:0:a::b { AdvRDNSSLifetime 6000; };
        DNSSL place5.ungleich.ch { AdvDNSSLLifetime 6000; } ;
};
```

View file

@ -1,4 +1,4 @@
title: Accessing IPv4 only hosts via IPv4
title: Accessing IPv4 only hosts via IPv6
---
pub_date: 2021-02-28
---

View file

@ -0,0 +1,110 @@
_discoverable: no
---
_hidden: no
---
title: ungleich SLA levels
---
subtitle: ungleich service level agreements
---
description1:
What is the right SLA (service level agreement) for you? At ungleich
we know that every organisation has individual needs and resources.
Depending on your need, we offer different types of service level
agreements.
## The standard SLA
If not otherwise specified in the product or service you acquired from
us, the standard SLA will apply. This SLA covers standard operations
and is suitable for non-critical deployments. The standard SLA covers:
* Target uptime of all services: 99.9%
* Service level: best effort
* Included for all products
* Support via support@ungleich.ch (answered 9-17 on work days)
* Individual Development and Support available at standard rate of 220 CHF/h
* No telephone support
---
feature1_title: Bronze SLA
---
feature1_text:
The business SLA is suited for running regular applications with a
focus on business continuity and individual support. Compared to the
standard SLA, it **guarantees you responses within 5 hours** on work
days. You can also **reach our staff at extended hours**.
---
feature2_title: Enterprise SLA
---
feature2_text:
The Enterprise SLA is right for you if you need high availability, but
you don't require instant reaction times from our team.
How this works:
* All services are set up in a high availability configuration (additional
charges for resources apply)
* The target uptime of services: 99.99%
---
feature3_title: High Availability (HA) SLA
---
feature3_text:
If your application is mission critical, this is the right SLA for
you. The **HA SLA** guarantees high availability, multi location
deployments with cross-datacenter backups and fast reaction times
on 24 hours per day.
---
offer1_title: Business SLA
---
offer1_text:
* Target uptime of all services: 99.9%
* Service level: guaranteed reaction within 1 business day
* Individual development and support: 180 CHF/h
* Telephone support (8-18 work days)
* Mail support (8-18 work days)
* Optional out of business hours hotline (360 CHF/h)
* 3'000 CHF/6 months
---
offer1_link: https://ungleich.ch/u/contact/
---
offer2_title: Enterprise SLA
---
offer2_text:
* Requires a high availability setup for all services (separate pricing applies)
* Service level: reaction within 4 hours
* Telephone support (24x7 work days)
* Services are provided in multiple data centers
* Included out of business hours hotline (180 CHF/h)
* 18'000 CHF/6 months
---
offer2_link: https://ungleich.ch/u/contact/
---
offer3_title: HA SLA
---
offer3_text:
* Uptime guarantees >= 99.99%
* Ticketing system reaction time < 3h
* 24x7 telephone support
* Applications running in multiple data centers
* Minimum monthly fee: 3000 CHF (according to individual service definition)
Individual pricing. Contact us at support@ungleich.ch for an individual
quote and we will get back to you.
---
offer3_link: https://ungleich.ch/u/contact/

View file

@ -58,6 +58,15 @@ Checkout the [SBB
page](https://www.sbb.ch/de/kaufen/pages/fahrplan/fahrplan.xhtml?von=Zurich&nach=Diesbach-Betschwanden)
for the next train.
The address is:
```
Hacking Villa
Hauptstrasse 28
8777 Diesbach
Switzerland
```
---
content1_image: hacking-villa-diesbach.jpg
---

View file

@ -45,6 +45,16 @@ Specifically for learning new technologies and to exchange knowledge
we created the **Hacking & Learning channel** which can be found at
**#hacking-and-learning:ungleich.ch**.
## Kubernetes
Recently (in 2021) we started to run Kubernetes clusters at
ungleich. We share our experiences in **#kubernetes:ungleich.ch**.
## Ceph
To exchange experiences and troubleshooting tips for ceph, we are running
**#ceph:ungleich.ch**.
## cdist
We meet for cdist discussions about using, developing and more
@ -57,7 +67,7 @@ We discuss topics related to sustainability in
## More channels
* The main / hangout channel is **o#town-square:ungleich.ch** (also bridged
* The main / hangout channel is **#town-square:ungleich.ch** (also bridged
to Freenode IRC as #ungleich and
[discord](https://discord.com/channels/706144469925363773/706144469925363776))
* The bi-yearly hackathon Hack4Glarus can be found in