title: How to build an OpenStack alternative: Step 1, the prototype
---
pub_date: 2020-01-11
---
author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: no
---
_discoverable: yes
---
abstract:

The step by step guide for doing it yourself
---
body:

In this article we describe a first prototype of an OpenStack alternative.

## Find out what you need

When we say building an alternative to OpenStack, we have something
specific in mind. This might be different from what you think
OpenStack is for: for us it means running a lot of virtual machines
for customers, with a lot of storage attached, with self service and
automated payments.

All code referred to in this article can be found on
[code.ungleich.ch](https://code.ungleich.ch/uncloud/uncloud/tree/master/uncloud/hack/hackcloud).

## Creating a network

The current setup at [Data Center
Light](/u/projects/data-center-light) relies heavily on VLANs. VLANs
however have a similar problem as IPv4 addresses: there are not that
many of them. So for our OpenStack replacement we decided to go with
[VXLANs](https://en.wikipedia.org/wiki/Virtual_Extensible_LAN)
instead. We also considered
[SRv6](https://www.segment-routing.net/tutorials/2017-12-05-srv6-introduction/),
however we did not see an advantage for our use case. In fact, VXLAN
seems to be much simpler.

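The scarcity gap is easy to quantify: a VLAN ID is a 12-bit field,
while a VXLAN network identifier (VNI) is 24 bits wide. A small
illustration (the variable names here are ours, not from the scripts
in the repository):

```
# Compare the VLAN and VXLAN network ID spaces
vlan_ids=$(( 1 << 12 ))   # 4096 VLAN IDs (0 and 4095 are reserved)
vni_ids=$(( 1 << 24 ))    # 16777216 VXLAN VNIs
echo "${vlan_ids} VLANs vs ${vni_ids} VXLANs"
```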
So before running a VM, we create a new VXLAN device and add it to a
bridge. This roughly looks as follows:

```
netid=100
dev=eth0

vxlandev=vxlan${netid}
bridgedev=br${netid}

# Create the vxlan device
ip -6 link add ${vxlandev} type vxlan \
    id ${netid} \
    dstport 4789 \
    group ff05::${netid} \
    dev ${dev} \
    ttl 5

ip link set ${vxlandev} up

# Create the bridge
ip link add ${bridgedev} type bridge
ip link set ${bridgedev} up

# Add the vxlan device into the bridge
ip link set ${vxlandev} master ${bridgedev} up
```

As you can see, we are using IPv6 multicast as the underlay for the
VXLAN, which is very practical in an IPv6 first data center.

## IP address management (IPAM)

Speaking of IPv6 first, all VMs in our new setup will again be IPv6
only and IPv4 addresses will be mapped to them via NAT64. This is very
similar to what you see at AWS, except that AWS uses
[RFC1918](https://tools.ietf.org/html/rfc1918) private IPv4 space,
whereas we use [globally unique IPv6
addresses](https://tools.ietf.org/html/rfc3587).

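To illustrate how NAT64 relates the two address families: RFC 6052
defines how an IPv4 address is embedded in the low 32 bits of an IPv6
address. The sketch below uses the well-known prefix `64:ff9b::/96`
and an example IPv4 address; the actual prefix and addresses in our
setup differ:

```
# Embed an IPv4 address into the well-known NAT64 prefix 64:ff9b::/96 (RFC 6052)
ipv4=192.0.2.1
set -- $(echo "$ipv4" | tr '.' ' ')
nat64_addr=$(printf '64:ff9b::%02x%02x:%02x%02x' "$1" "$2" "$3" "$4")
echo "$nat64_addr"   # 64:ff9b::c000:0201
```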

The advantage of using IPv6 here is that you will never have an
address collision and that your VM stays very clean: there is no need
to think about IPv4 firewall rules, you only need to configure IPv6
settings.

In the IPv6 world, we use router advertisements as an alternative to
DHCP in the IPv4 world. This has the advantage that no state is kept
on the server.

To enable our IPAM, we add an IPv6 address to our bridge and start
the radvd daemon:

```
# ${ip} is the router address inside the prefix; ${bridgedev} comes
# from the network setup above
ip addr add ${ip} dev ${bridgedev}
radvd -C ./radvd.conf -n -p ./radvdpid
```

A sample radvd configuration we used for testing looks like this:

```
interface br100
{
    AdvSendAdvert on;
    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 5;
    AdvDefaultLifetime 3600;

    prefix 2a0a:e5c1:111:888::/64 {
    };

    RDNSS 2a0a:e5c0::3 2a0a:e5c0::4 { AdvRDNSSLifetime 6000; };
    DNSSL place7.ungleich.ch { AdvDNSSLLifetime 6000; };
};
```

With this, we are ready to spawn a VM!

## Create a VM

The current setup at Data Center Light uses libvirtd for creating
VMs. This is problematic, because libvirtd is not very reliable:
sometimes it stops answering `virsh` commands or begins to use 100%
CPU and needs to be killed and restarted regularly. We have seen this
behaviour on CentOS 5, CentOS 6, Debian 8 and Devuan 9.

So in our version, we skip libvirt and run qemu directly. It turns out
that this is actually not that hard and can be done using the
following script:

```
vmid=$1; shift

qemu=/usr/bin/qemu-system-x86_64

accel=kvm
#accel=tcg

memory=1024
cores=2
uuid=732e08c7-84f8-4d43-9571-263db4f80080

export bridge=br100

$qemu -name uc${vmid} \
    -machine pc,accel=${accel} \
    -m ${memory} \
    -smp ${cores} \
    -uuid ${uuid} \
    -drive file=alpine-virt-3.11.2-x86_64.iso,media=cdrom \
    -netdev tap,id=netmain,script=./ifup.sh \
    -device virtio-net-pci,netdev=netmain,id=net0,mac=02:00:f0:a9:c4:4e
```

This starts a VM with a hard coded mac address using KVM
acceleration. We give the VM 2 cores and assign it a UUID so that we
can easily find it again later. For testing, we have attached an
[Alpine Linux ISO](https://alpinelinux.org/).

The interesting part, however, is the network part. We create a virtio
based network card and qemu executes `ifup.sh` after it has started.

The ifup.sh script looks as follows:

```
dev=$1; shift

# bridge is setup from outside
ip link set dev "$dev" master ${bridge}
ip link set dev "$dev" up
```

It basically adds the tap device to the previously created bridge.

## That's all there is

Using only the above steps we spawned a test VM on a test machine that
is reachable at `2a0a:e5c1:111:888:0:f0ff:fea9:c44e`, world wide. If
our test machine is on, you should be able to reach it from anywhere
in the world.

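That address is no accident: via the router advertisements, the VM
autoconfigures a SLAAC address derived from the prefix in radvd.conf
and the mac address we gave qemu. The EUI-64 rule flips the
universal/local bit of the first mac byte and inserts `ff:fe` in the
middle. A sketch of the derivation:

```
# Derive the SLAAC (EUI-64) address from the radvd prefix and the qemu mac address
prefix=2a0a:e5c1:111:888
mac=02:00:f0:a9:c4:4e

set -- $(echo "$mac" | tr ':' ' ')       # split the mac into six bytes
b1=$(printf '%02x' $(( 0x$1 ^ 2 )))      # flip the universal/local bit
slaac_addr="${prefix}:${b1}${2}:${3}ff:fe${4}:${5}${6}"
echo "$slaac_addr"   # 2a0a:e5c1:111:888:0000:f0ff:fea9:c44e
```

The leading zero group compresses to the `:0:` seen in the address
above.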
Obviously this is not a full OpenStack replacement. However, we wanted
to share the small steps that we are taking towards creating it. We
really enjoy running virtual machine hosting and wanted to show you
how much fun it can be.

## Next step

A lot of things in the above example are hard coded and aren't usable
for customers directly. In the next step we will generalise some of
the above functions to get closer and closer to a fully usable
OpenStack alternative.

If you are interested in this topic, you can join us on the [ungleich
chat](https://chat.ungleich.ch); the full development of our
alternative is open source.