diff --git a/content/u/blog/how-to-build-an-openstack-alternative-step-1/contents.lr b/content/u/blog/how-to-build-an-openstack-alternative-step-1/contents.lr
new file mode 100644
index 0000000..e251dc0
--- /dev/null
+++ b/content/u/blog/how-to-build-an-openstack-alternative-step-1/contents.lr
@@ -0,0 +1,197 @@
+title: How to build an OpenStack alternative: Step 1, the prototype
+---
+pub_date: 2020-01-11
+---
+author: ungleich virtualisation team
+---
+twitter_handle: ungleich
+---
+_hidden: no
+---
+_discoverable: yes
+---
+abstract:
+A step-by-step guide for doing it yourself
+---
+body:
+
+In this article we describe a first prototype.
+
+## Find out what you need
+
+When we say building an alternative to OpenStack, we have something
+specific in mind. This might be different from what you think
+OpenStack is for. For us it is running a lot of virtual machines for
+customers with a lot of storage attached, with self-service and
+automated payments.
+
+All code we refer to in this article can be found on
+[code.ungleich.ch](https://code.ungleich.ch/uncloud/uncloud/tree/master/uncloud/hack/hackcloud).
+
+## Creating a network
+
+The current setup at [Data Center
+Light](/u/projects/data-center-light) relies heavily on VLANs. VLANs,
+however, have a problem similar to IPv4 addresses: there are not that
+many of them. So for our OpenStack replacement we decided to go with
+[VXLANs](https://en.wikipedia.org/wiki/Virtual_Extensible_LAN)
+instead. We also considered
+[SRv6](https://www.segment-routing.net/tutorials/2017-12-05-srv6-introduction/),
+but we did not see an advantage for our use case. In fact, VXLANs
+seem to be much simpler.
+
+So before running a VM, we create a new VXLAN device and add it to a
+bridge.
+This roughly looks as follows:
+
+```
+netid=100
+dev=eth0
+
+vxlandev=vxlan${netid}
+bridgedev=br${netid}
+
+# Create the vxlan device
+ip -6 link add ${vxlandev} type vxlan \
+  id ${netid} \
+  dstport 4789 \
+  group ff05::${netid} \
+  dev ${dev} \
+  ttl 5
+
+ip link set ${vxlandev} up
+
+# Create the bridge
+ip link add ${bridgedev} type bridge
+ip link set ${bridgedev} up
+
+# Add the vxlan device into the bridge
+ip link set ${vxlandev} master ${bridgedev} up
+```
+
+As you can see, we use IPv6 multicast as the underlay for the VXLAN,
+which is very practical in an IPv6-first data center.
+
+## IP address management (IPAM)
+
+Speaking of IPv6 first, all VMs in our new setup will again be IPv6
+only and IPv4 addresses will be mapped to them via NAT64. This is very
+similar to what you see at AWS, just that AWS uses
+[RFC1918](https://tools.ietf.org/html/rfc1918) private IPv4 space
+instead of [globally unique IPv6
+addresses](https://tools.ietf.org/html/rfc3587), which we do.
+
+The advantage of using IPv6 here is that you will never have an address
+collision and that your VM is very clean: no need to think about IPv4
+firewall rules, you only need to configure IPv6 settings.
+
+In the IPv6 world, we use router advertisements as an alternative to
+DHCP in the IPv4 world. This has the advantage that no state is kept
+on the server.
+
+To enable our IPAM, we add an IPv6 address to our bridge and start
+the radvd daemon:
+
+```
+ip addr add ${ip} dev ${bridgedev}
+radvd -C ./radvd.conf -n -p ./radvdpid
+```
+
+A sample radvd configuration we used for testing looks like this:
+
+```
+interface br100
+{
+  AdvSendAdvert on;
+  MinRtrAdvInterval 3;
+  MaxRtrAdvInterval 5;
+  AdvDefaultLifetime 3600;
+
+  prefix 2a0a:e5c1:111:888::/64 {
+  };
+
+  RDNSS 2a0a:e5c0::3 2a0a:e5c0::4 { AdvRDNSSLifetime 6000; };
+  DNSSL place7.ungleich.ch { AdvDNSSLLifetime 6000; } ;
+};
+```
+
+With this, we are ready to spawn a VM!
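
To see which address a VM ends up with under this setup, the following sketch derives an address from the advertised /64 prefix and a MAC address. It assumes the guest forms its SLAAC address via modified EUI-64 (some operating systems use stable-privacy or temporary addresses instead). `slaac_addr` is a hypothetical helper for illustration and is not part of the uncloud code. The MAC address is the one used for the test VM in this article.

```shell
#!/bin/sh
# Sketch: derive the modified EUI-64 (SLAAC) address for a prefix and a MAC.
# slaac_addr is a hypothetical helper, not part of the uncloud code.
slaac_addr() {
    prefix=$1   # first four groups of the /64, e.g. 2a0a:e5c1:111:888
    mac=$2      # e.g. 02:00:f0:a9:c4:4e

    # Split the MAC into its six octets ($1 .. $6)
    old_ifs=$IFS; IFS=:
    set -- $mac
    IFS=$old_ifs

    # Flip the universal/local bit (0x02) of the first octet
    first=$(( 0x$1 ^ 0x02 ))

    # Insert ff:fe in the middle and print each group without leading zeros
    printf '%s:%x:%x:%x:%x\n' "$prefix" \
        $(( ($first << 8) | 0x$2 )) \
        $(( 0x${3}ff )) \
        $(( 0xfe$4 )) \
        $(( 0x$5$6 ))
}

slaac_addr 2a0a:e5c1:111:888 02:00:f0:a9:c4:4e
```

For the prefix from the sample radvd configuration and this MAC address, the function prints `2a0a:e5c1:111:888:0:f0ff:fea9:c44e`.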
+
+## Create a VM
+
+The current setup at Data Center Light uses libvirtd for creating
+VMs. This is problematic, because libvirtd is not very reliable:
+sometimes it stops answering `virsh` commands or begins to use 100%
+CPU and needs to be killed and restarted regularly. We have seen this
+behaviour on CentOS 5, CentOS 6, Debian 8 and Devuan 9.
+
+So in our version, we skip libvirt and run qemu directly. It turns out
+that this is actually not that hard and can be done using the
+following script:
+
+
+```
+vmid=$1; shift
+
+qemu=/usr/bin/qemu-system-x86_64
+
+accel=kvm
+#accel=tcg
+
+memory=1024
+cores=2
+uuid=732e08c7-84f8-4d43-9571-263db4f80080
+
+export bridge=br100
+
+$qemu -name uc${vmid} \
+  -machine pc,accel=${accel} \
+  -m ${memory} \
+  -smp ${cores} \
+  -uuid ${uuid} \
+  -drive file=alpine-virt-3.11.2-x86_64.iso,media=cdrom \
+  -netdev tap,id=netmain,script=./ifup.sh \
+  -device virtio-net-pci,netdev=netmain,id=net0,mac=02:00:f0:a9:c4:4e
+```
+
+This starts a VM with a hard-coded MAC address using KVM
+acceleration. We give the VM 2 cores and assign it a UUID so that we
+can easily find it again later. For testing, we have attached an
+[Alpine Linux ISO](https://alpinelinux.org/).
+
+The interesting part, however, is the networking. We create a virtio
+based network card and execute `ifup.sh` after qemu has been started.
+
+The ifup.sh script looks as follows:
+
+```
+dev=$1; shift
+
+# bridge is set up from outside
+ip link set dev "$dev" master ${bridge}
+ip link set dev "$dev" up
+```
+
+It basically adds the tap device to the previously created bridge.
+
+## That's all there is
+
+Using only the steps above, we spawned a test VM on a test machine. It
+is reachable at `2a0a:e5c1:111:888:0:f0ff:fea9:c44e`: as long as our
+test machine is on, you should be able to reach it from anywhere in
+the world.
+
+Obviously this is not a full OpenStack replacement. However, we wanted
+to share the small steps that we take for creating it.
+And we really
+like running virtual machine hosting and wanted to show you how
+much fun it can be.
+
+## Next step
+
+A lot of things in the above example are hard-coded and aren't usable
+for customers directly. In the next step we will generalise some of
+the above functions to get closer and closer to a fully
+usable OpenStack alternative.
+
+If you are interested in this topic, you can join us on the [ungleich
+chat](https://chat.ungleich.ch). The full development of our
+alternative is open source.
diff --git a/content/u/blog/uncloud-next b/content/u/blog/uncloud-next
new file mode 100644
index 0000000..4fc701b
--- /dev/null
+++ b/content/u/blog/uncloud-next
@@ -0,0 +1 @@
+- how to secure the network