title: How to build an OpenStack alternative: Step 4, adding a database
---
pub_date: 2020-01-14
---
author: ungleich virtualisation team
---
twitter_handle: ungleich
---
_hidden: yes
---
_discoverable: yes
---
abstract: Data begins to accumulate
---
body:

This time we describe how to store information in a database.
The previous time we described
[how to generate MAC addresses](../how-to-build-an-openstack-alternative-step-3-automating-mac-addresses/),
a key element of uncloud.

## More Data

We now have a couple of running VMs. We want to remember which VMs
are running and also record more information about them: who owns a
VM and, later, where the VM is running.

## Database

We decided to use [etcd](https://etcd.io/) as our primary
database. The main reason is that we don't want to add a single
point of failure to uncloud, and we don't need the guarantees
provided by a standard SQL database.

An alternative to etcd that we still consider is PostgreSQL, which
also supports storing JSON and has quite a sophisticated messaging
system.

## Refactoring: phasing in a database

So far we used a couple of Python and shell scripts to create the
base of uncloud. Now that things are becoming a bit more serious, we
needed to refactor our code.

The shell and Python scripts were cleaned up and became a proper
Python module, which we lovingly call `uncloud.hack`.

## Python, ETCD and JSON

We decided to use
[python-etcd3](https://python-etcd3.readthedocs.io/) to access etcd
from the Python world, as it supports version 3 of the etcd API.

For the data format we decided to use JSON, as it is easy to read.
Each VM is identified by a random UUID, so we don't need to store a
counter for VMs.

## Status

At this point uncloud can create VMs, and the VMs are registered in
etcd, which serves as the database. So while we don't yet have logic
for (automatic) VM migration, the information about VMs is already
stored in a distributed database.

If one of our hosts vanishes, we can in theory already redeploy the
existing VMs.
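
As a rough illustration of the storage scheme described under "Python, ETCD and JSON", the following sketch registers a VM under a random UUID and stores its record as JSON. It is not the actual `uncloud.hack` code: the key prefix `/uncloud/vm/`, the field names `owner` and `mac`, and the helper names `register_vm` and `load_vm` are assumptions made for this example. The `InMemoryKV` class is a hypothetical stand-in that imitates the `put()` and `get()` calls of a python-etcd3 client, so the sketch runs without an etcd cluster.

```python
import json
import uuid

# Hypothetical key layout; the real uncloud prefix may differ.
VM_PREFIX = "/uncloud/vm/"


class InMemoryKV:
    """Stand-in that imitates put()/get() of a python-etcd3 client."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # etcd stores bytes, so encode strings before keeping them.
        self._data[key] = value.encode() if isinstance(value, str) else value

    def get(self, key):
        # python-etcd3 returns a (value, metadata) tuple;
        # the value is None when the key does not exist.
        return self._data.get(key), None


def register_vm(kv, owner, mac):
    """Store a new VM record as JSON and return its random UUID."""
    vm_id = str(uuid.uuid4())  # random UUID, so no VM counter is needed
    record = {"uuid": vm_id, "owner": owner, "mac": mac}
    kv.put(VM_PREFIX + vm_id, json.dumps(record))
    return vm_id


def load_vm(kv, vm_id):
    """Read a VM record back, or return None if it is unknown."""
    value, _metadata = kv.get(VM_PREFIX + vm_id)
    if value is None:
        return None
    return json.loads(value)


if __name__ == "__main__":
    kv = InMemoryKV()
    vm_id = register_vm(kv, owner="alice", mac="42:00:00:00:00:01")
    print(load_vm(kv, vm_id))
```

Against a real cluster, `kv` could instead be created with `etcd3.client(host="localhost", port=2379)`; the host and port here are placeholders for a local test setup.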