blog++: uncloud metadata

Nico Schottelius 2020-01-15 11:52:19 +01:00
parent 95fe4eb9d5
commit c339bc86d7
3 changed files with 141 additions and 6 deletions

@ -6,7 +6,7 @@ author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: yes
_hidden: no
---
_discoverable: yes
---
@ -16,16 +16,17 @@ Data begins to accumulate
body:
This time we describe how to
store information in a database.
store information in a database and why we selected etcd as the
primary database.
The previous time we described
[how to generate MAC
addresses(../how-to-build-an-openstack-alternative-step-3-automating-mac-addresses/),
addresses](../how-to-build-an-openstack-alternative-step-3-automating-mac-addresses/),
a key element of uncloud.
## More Data
We know have a couple of running VMs, we want to remember which VMs
We now have a couple of running VMs, we want to remember which VMs
are running and also add more information. Who owns a VM? And later
also where is the VM running.
@ -36,7 +37,8 @@ The main reason for it is that we don't want to add a single point of
failure into uncloud and we don't need guarantees provided by
standard SQL.
An alternative we still consider to etcd is postgresql, which also
An alternative we still consider is postgresql. While it is not
inherently distributed (at all), it also
supports storing JSON and has quite a sophisticated messaging system.
## Refactoring: phasing in a database

@ -0,0 +1,130 @@
title: How to build an OpenStack alternative: Step 5, adding metadata
---
pub_date: 2020-01-15
---
author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:
Let the VMs get information about themselves
---
body:
This time we describe how virtual machines can get information about
themselves, such as which ssh keys should have access to them.
The previous time we
[added a database to
uncloud](../how-to-build-an-openstack-alternative-step-4-adding-a-database/).
## Motivation
If we were to start VMs without a metadata service, all of the VMs
would look identical and would not know whom to allow access.
To customise a VM or to make it usable, we need to tell it who has
access to it and potentially inject even more information.
## Metadata service: how others do it
Enter the metadata service. OpenNebula solves this problem quite
nicely by attaching a virtual cdrom to the VMs. That cdrom contains
only one file, `context.sh`. This file contains information about:
* networking
* ssh keys
OpenStack with cloud-init, on the other hand, uses an HTTP-based
service that is found at the address `http://169.254.169.254/`.
Both schemes come with disadvantages that we don't want to replicate
in uncloud:
In the OpenNebula case, changing metadata information while the VM is
running requires creating a new CDROM, and if the old one is still
mounted, the VM might not get the up-to-date information. This is a
bit of a theoretical case, as the metadata is rarely re-used after
booting.
However, changing the information provided in the context.sh inside the
ISO always requires generating a new ISO. While technically possible,
this is not very elegant.
The OpenStack based approach has (from our point of view) a much
bigger problem: it relies on IPv4. VMs running on uncloud primarily
run IPv6 and should function without any IPv4 stack.
The motivation for using the 169.254.0.0/16 network is clear: it works
without having an IP address management system in place.
## Solving it the smart way
So it seems like the general approach of OpenStack/cloud-init would
actually be quite elegant, if it weren't forcing IPv4.
In the IPv6 world, we always have link local addresses in the
**fe80::/10** network. Should we just replace the OpenStack approach
with IPv6?
We don't think so: a fixed IPv6 address would force IPv6 on every
network, and IPv4-only networks could raise the same objection
against it that we raise against a fixed IPv4 address.
Instead, we suggest a simple change to the OpenStack approach:
use http://metadata instead of an IP address.
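A minimal client-side sketch of this idea in Python follows. The only
thing taken from the text above is the URL `http://metadata`. The JSON
layout, the `ssh-keys` field and the function names are assumptions
made for illustration; they are not the format of the uncloud metadata
service.

```python
import json
from urllib.request import urlopen

# Bare hostname as proposed in this post; the resolver decides whether
# the connection ends up on IPv4 or IPv6.
METADATA_URL = "http://metadata/"


def parse_ssh_keys(payload: bytes) -> list[str]:
    """Extract ssh public keys from a JSON document.

    Assumption: the service answers with a JSON object that has an
    "ssh-keys" list. This layout is hypothetical.
    """
    data = json.loads(payload)
    return [key.strip() for key in data.get("ssh-keys", []) if key.strip()]


def fetch_ssh_keys(url: str = METADATA_URL, timeout: float = 5.0) -> list[str]:
    """Fetch the metadata document by name and return the ssh keys."""
    with urlopen(url, timeout=timeout) as response:
        return parse_ssh_keys(response.read())
```

Inside a VM, calling `fetch_ssh_keys()` would be enough; the client
code never mentions an IP address, which is the point of the proposal.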
## http://metadata
So how should this work and why is this better than using
http://169.254.169.254/?
Using a name, it doesn't matter whether the VM is on an IPv4 or
IPv6 network.
Using just the hostname, not an FQDN (e.g. metadata.example.com), makes
it portable.
The name can be resolved via various methods:
* uncloud: it will be delivered by DNS
* OpenStack: either via DNS (like uncloud) or, if there is no IPAM, it
can be set statically in /etc/hosts
In the DNS resolving case, this actually gets even more interesting,
because we can use the **DNS search path**. So while the client tries
to resolve the hostname **metadata**, the underlying resolver library
will also look for **metadata.example.com**, if example.com is in the
search path.
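The search path behaviour can be sketched in a few lines of Python.
This is a simplified model of what a stub resolver does, not the
actual resolver code: `search_candidates` is a hypothetical helper,
and the `ndots` handling only covers the common default case.

```python
def search_candidates(name: str, search_domains: list[str], ndots: int = 1) -> list[str]:
    """Return the fully qualified names a resolver would try, in order.

    Simplified model: names with fewer than `ndots` dots are tried with
    the search domains first and as-is last; other names are tried
    as-is first. A trailing dot disables the search path entirely.
    """
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search_domains]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


print(search_candidates("metadata", ["example.com"]))
# prints ['metadata.example.com.', 'metadata.']
```

With `example.com` in the search path, the bare name `metadata` is
therefore looked up as `metadata.example.com` before anything else,
so each site can point it at its own metadata service.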
## uncloud implementation
In uncloud we have implemented a sample [metadata
service](https://code.ungleich.ch/uncloud/uncloud/tree/master/uncloud/metadata).
However, IPAM (e.g. router advertisements) and DNS servers are not part
of uncloud; they can be provided by the regular system infrastructure.
In one of the next versions we plan to include helpers that allow you
to bootstrap IPAM and DNS easily.
## Status
At this point uncloud can create VMs and the VMs can get the ssh keys
that should have access from the metadata service.
With this latest add-on uncloud comes close to being a usable
prototype. A lot of things will probably need to be refactored in the
future, but at the moment uncloud already supports:
* creating VMs (using qemu)
* securing the VM network (using nftables)
* generating unique mac addresses (uncloud python code)
* storing information in a distributed database (using python-etcd3 and
etcd)
* providing basic metadata information (uncloud python code)

@ -1,6 +1,9 @@
- step 4 = database!
x step 4 = database!
- image contextualisation
- dnsmasq
- metadata
- for getting keys
- user metadata
- user authentication
- otp or ldap or plain file
- for registering keys