Signed-off-by: Nico Schottelius <nico@ikn.schottelius.org>
[[!meta title="Linux virtual machine software is a real pain"]]

This report is about my experience with virtual machines today.


## UML (user-mode-linux)

It began this morning, when I tried to set up a new virtual
machine with [User mode Linux](http://user-mode-linux.sourceforge.net/).
I could easily reuse an existing installation using
copy-on-write with the following command:

    linux umid=vm4 uml_dir=/home/nico/vm/uml con1=pts ubda=/home/nico/vm/cow/vm4,/home/nico/vm/images/debian eth0=tuntap,,,192.168.4.1 mem=4096M

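The invocation above can be wrapped in a small helper, so that every new machine gets its own copy-on-write file on top of the shared base image. This is only a sketch: the function name `uml_cmdline` and its two arguments are made up for illustration, while the paths and options are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the UML command line for the VM named $1
# whose tuntap device gets the host-side address $2.
# The directory layout mirrors the command shown above.
uml_cmdline() {
    name=$1
    host_ip=$2
    vmdir=/home/nico/vm
    # ubda=<cow file>,<backing image>: writes go to the COW file,
    # the shared Debian image is only read.
    printf '%s\n' "linux umid=$name uml_dir=$vmdir/uml con1=pts ubda=$vmdir/cow/$name,$vmdir/images/debian eth0=tuntap,,,$host_ip mem=4096M"
}

# Print (not run) the command for vm4.
uml_cmdline vm4 192.168.4.1
```

Printing the command instead of executing it makes it easy to review before starting the machine.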

After I issued

    apt-get update && apt-get dist-upgrade

in the virtual machine, it hung and no longer reacted to new ssh connections.
I've seen this behaviour quite often with **user mode Linux** when there is
"a lot" of disk input/output. OK, I wanted to use some kind of
framework for my virtual machines anyway, so for the time being,
let's forget about UML and try libvirt+kvm.


## Libvirt

The [libvirt](http://libvirt.org/) project looks quite promising
from its documentation, especially in combination with
[virt-manager](http://virt-manager.org/). Creating a
new virtual machine with virt-manager is kind of strange, because
it insists on having an installation medium. Locating a
Debian live CD is not so difficult, though. But then came the big problem:
When I tried to create a new disk image, virt-manager just hung
for several minutes, without the host system doing anything.
Some time before, I had already had massive problems with virt-manager
when selecting a different pool for the images, which caused several
problems when trying to start the VM.

But well, let's give [virsh](http://www.libvirt.org/apps.html) a try,
the command line utility to manage libvirt. Creating a new disk image
with virsh is pretty easy:

    vol-create-as default jr.img 8G


A bit confusing is the fact that the **vol-create** command without
the **-as** suffix expects an XML file as input. A look at the
other create commands confirms this pattern:

    ikn:~% LANG=C LC_ALL=C virsh help | grep create
        create          create a domain from an XML file
        net-create      create a network from an XML file
        nodedev-create  create a device defined by an XML file on the node
        pool-create     create a pool from an XML file
        pool-create-as  create a pool from a set of args
        vol-create      create a vol from an XML file
        vol-create-from create a vol, using another volume as input
        vol-create-as   create a volume from a set of args
    ikn:~% virsh --version
    0.7.4

Some commands do not support creation from the command line, but
only from an XML file, which makes virsh useless for interactive
and scripting use.

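For the commands that only accept XML, one workaround in scripts is to generate the file on the fly. The following is a minimal sketch for the volume case, assuming the element names from the libvirt storage volume format (`volume`, `name`, `capacity` with a `unit` attribute); the helper name `vol_xml` is hypothetical and I have not checked the output against virsh 0.7.4.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal volume definition for
# "virsh vol-create" to stdout. $1 = volume name, $2 = size in GiB.
vol_xml() {
    cat <<EOF
<volume>
  <name>$1</name>
  <capacity unit="G">$2</capacity>
</volume>
EOF
}

# Generate the definition for the 8G image used above.
vol_xml jr.img 8

# Assumed usage (not run here): store the XML and pass it to vol-create.
#   vol_xml jr.img 8 > /tmp/jr.xml && virsh vol-create default /tmp/jr.xml
```

This does not make the XML-only commands any nicer, but it keeps the XML out of sight for the simple cases.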

This brings me to the new kid on the block: ganeti.

## Ganeti

When I first experienced problems with libvirt, some people pointed
me to [ganeti](http://code.google.com/p/ganeti/)
(to tell the truth, it was one of the ganeti developers).
Until today I had put this idea off, but after the problems with libvirt
I decided to give ganeti-2.0.5-1 (Debian package) a try. First of all
I tried to follow the
[installation tutorial](http://ganeti-doc.googlecode.com/svn/ganeti-1.2/install.html)
referenced on the homepage, which is heavily oriented towards
[Xen](http://www.xen.org/) and [LVM](http://tldp.org/HOWTO/LVM-HOWTO/),
neither of which I plan to use. Trying to get ganeti running, I ran into
some interesting problems:

    [11:26] tee:root# gnt-cluster init ganeti.schottelius.org
    Failure: prerequisites not met for this operation:
    This host's IP resolves to the private range (127.0.1.1). Please fix DNS or /etc/hosts.

This is described in the ganeti manual and easily fixed by commenting out the
relevant entry in ***/etc/hosts***:

    [11:27] tee:root# grep tee.schottelius.org /etc/hosts
    #127.0.1.1 tee.schottelius.org tee

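Whether a host name still resolves to a loopback address can be checked before running `gnt-cluster init` again. The sketch below uses `getent ahostsv4` from glibc for the lookup; that this matches the lookup ganeti performs internally is an assumption on my side, and the function name `is_loopback` is made up for this example.

```shell
#!/bin/sh
# Return success if the IPv4 address $1 lies in 127.0.0.0/8.
is_loopback() {
    case $1 in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Name to check: first argument, or the node name of this machine.
name=${1:-$(uname -n)}

# The first field of the first output line is the resolved address.
addr=$(getent ahostsv4 "$name" 2>/dev/null | awk 'NR == 1 { print $1 }')

if [ -z "$addr" ]; then
    echo "$name does not resolve"
elif is_loopback "$addr"; then
    echo "$name resolves to loopback address $addr, check /etc/hosts"
else
    echo "$name resolves to $addr"
fi
```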

After that I was a bit confused by ganeti not finding its cluster name:

    [11:27] tee:root# gnt-cluster init ganeti.schottelius.org
    Failure: can't resolve hostname 'ganeti.schottelius.org'
    [11:28] tee:root# ping ganeti.schottelius.org
    PING ganeti.schottelius.org (77.109.138.195) 56(84) bytes of data.
    64 bytes from ganeti.schottelius.org (77.109.138.195): icmp_seq=1 ttl=64 time=0.026 ms
    ^C
    --- ganeti.schottelius.org ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.026/0.026/0.026/0.000 ms

Retrying twice "solved" the problem, which is a bit confusing to me,
as ganeti and ping both use the same resolver library. After that I met the
"no-lvm-problem":

    [11:38] tee:root# gnt-cluster init -b br0 ganeti.schottelius.org
    Failure: prerequisites not met for this operation:
    Error: volume group 'xenvg' missing
    specify --no-lvm-storage if you are not using lvm


Specifying the required parameter led me to a new problem:

    [11:38] tee:root# gnt-cluster init -b br0 --no-lvm-storage ganeti.schottelius.org
    Failure: prerequisites not met for this operation:
    Invalid master netdev given (xen-br0): 'Device "xen-br0" does not exist.'

Which is interesting: **ganeti seems to ignore the bridge parameter -b**.
So, to use ganeti, I **renamed the bridge from br0 to xen-br0** in
***/etc/network/interfaces***:

    auto xen-br0
    iface xen-br0 inet manual
        bridge_ports eth1


And finally I was able to initialise the ganeti cluster:

    [15:06] tee:root# gnt-cluster init -b br0 --no-lvm-storage ganeti.schottelius.org

Then I tried to join the host into the cluster, which failed, and retrieving
status information failed as well:

    [15:06] tee:root# gnt-node add tee.schottelius.org
    Node tee.schottelius.org already in the cluster (as tee.schottelius.org) - please retry with '--readd'
    [15:07] tee:root# gnt-node list
    Node                DTotal DFree MTotal MNode MFree Pinst Sinst
    tee.schottelius.org      ?     ?      ?     ?     ?     0     0

Trying to re-add it results in an error without an error message
and does not fix the problem:

    [15:07] tee:root# gnt-node add --readd tee.schottelius.org
    The authenticity of host 'tee.schottelius.org (77.109.138.222)' can't be established.
    RSA key fingerprint is c7:d0:a8:32:ad:f0:9b:fa:1e:77:d5:1f:64:d8:9b:db.
    Are you sure you want to continue connecting (yes/no)? yes
    Thu Dec 31 15:08:23 2009 - INFO: Readding a node, the offline/drained flags were reset
    Thu Dec 31 15:08:23 2009 - INFO: Node will be a master candidate
    Failure: command execution error:
    [15:08] tee:root#
    [15:32] tee:root# gnt-node list
    Node                DTotal DFree MTotal MNode MFree Pinst Sinst
    tee.schottelius.org      ?     ?      ?     ?     ?     0     0
    [15:34] tee:root#


At that point I was pointed to the
[more recent documentation](http://ganeti-doc.googlecode.com/svn/ganeti-2.0/install.html)
of ganeti and began from scratch:

    [16:22] tee:vm# gnt-cluster destroy --yes-do-it
    [16:23] tee:vm# gnt-cluster init --no-lvm-storage ganeti.schottelius.org
    [16:26] tee:vm# gnt-node list
    Node                DTotal DFree MTotal MNode MFree Pinst Sinst
    tee.schottelius.org      ?     ?      ?     ?     ?     0     0

After double-checking that the needed daemons were running
(/etc/init.d/ganeti restart), I got a good hint: One has to specify
the hypervisor to use during initialisation:

    [16:34] tee:vm# gnt-cluster destroy --yes-do-it
    [16:35] tee:vm# gnt-cluster init --no-lvm-storage -t kvm ganeti.schottelius.org
    [16:36] tee:vm# gnt-node list
    Node                DTotal DFree MTotal MNode MFree Pinst Sinst
    tee.schottelius.org      ?     ?  19.6G  2.7G 17.6G     0     0

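In the listings above, the memory columns are the ones that change from `?` to real values once the hypervisor is set, while the disk columns stay at `?` without LVM. A script could use that as a rough health check. This is a sketch based only on the output shown here, not on any documented ganeti behaviour, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the names of nodes whose MTotal column (4th field) is "?".
# Reads "gnt-node list" output on stdin; the header line is skipped.
nodes_without_memory_info() {
    awk 'NR > 1 && $4 == "?" { print $1 }'
}

# Sample input copied from the first listing above.
nodes_without_memory_info <<'EOF'
Node                DTotal DFree MTotal MNode MFree Pinst Sinst
tee.schottelius.org      ?     ?      ?     ?     ?     0     0
EOF

# Assumed live usage: gnt-node list | nodes_without_memory_info
```

Fed with the output from after `-t kvm`, the function prints nothing.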

Now I tried to add a new virtual machine instance, which resulted in another
error:

    [16:51] tee:vm# gnt-instance add -t file -s 4G -o debootstrap -n tee.schottelius.org jr.nachtbrand.ch
    Failure: prerequisites not met for this operation:
    Hypervisor parameter validation failed on node tee.schottelius.org: Instance kernel '/boot/vmlinuz-2.6-kvmU' not found or not a file

This seems to be some kind of ganeti logic to keep the kernel outside the
block device, which is similar to the user mode Linux approach. After linking one
of the host kernels and its initrd, adding an instance succeeded:

    [16:59] tee:/boot# ln -s vmlinuz-2.6.30-2-amd64 vmlinuz-2.6-kvmU
    [16:59] tee:/boot# ln -s initrd.img-2.6.30-2-amd64 initrd-2.6-kvmU
    [17:00] tee:vm# gnt-instance add -t file -s 4G -o debootstrap -n tee.schottelius.org jr.nachtbrand.ch
    [17:01] tee:/boot# gnt-instance list
    Instance         Hypervisor OS          Primary_node        Status  Memory
    jr.nachtbrand.ch kvm        debootstrap tee.schottelius.org running   128M

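To catch the missing kernel before `gnt-instance add` complains, the two paths can be tested up front. A small sketch: the kernel path is taken from the error message above and the initrd path from the symlink created next to it; the function name `check_boot_file` is made up.

```shell
#!/bin/sh
# Report whether path $1 exists and is a regular file.
# Symlinks are followed, so the links created above count.
check_boot_file() {
    if [ -f "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

for f in /boot/vmlinuz-2.6-kvmU /boot/initrd-2.6-kvmU; do
    # "|| true": only report here, do not abort on the first miss.
    check_boot_file "$f" || true
done
```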

It is also correctly connected to the bridge, seen as valid by **gnt-os**
and **gnt-cluster verify** looks good:

    [17:14] tee:/boot# brctl show
    bridge name     bridge id               STP enabled     interfaces
    br0             8000.000000000000       no
    xen-br0         8000.0015176a26f7       no              eth1
                                                            tap4
    [17:16] tee:/boot# gnt-os diagnose
    OS: debootstrap [global status: valid]
    Node: tee.schottelius.org, status: VALID (path: /usr/share/ganeti/os/debootstrap)
    [17:17] tee:/boot# gnt-cluster verify
    Thu Dec 31 17:17:24 2009 * Verifying global settings
    Thu Dec 31 17:17:24 2009 * Gathering data (1 nodes)
    Thu Dec 31 17:17:24 2009 * Verifying node tee.schottelius.org (master)
    Thu Dec 31 17:17:24 2009 * Verifying instance jr.nachtbrand.ch
    Thu Dec 31 17:17:24 2009 * Verifying orphan volumes
    Thu Dec 31 17:17:24 2009 * Verifying remaining instances
    Thu Dec 31 17:17:24 2009 * Verifying N+1 Memory redundancy
    Thu Dec 31 17:17:24 2009 * Other Notes
    Thu Dec 31 17:17:24 2009 - NOTICE: 1 non-redundant instance(s) found.
    Thu Dec 31 17:17:24 2009 * Hooks Results


As specified in the documentation, I tried to connect to the console:

    [17:28] tee:/boot# gnt-instance console jr.nachtbrand.ch
    [17:30] tee:/boot# gnt-instance console --show-cmd jr.nachtbrand.ch
    ssh -q -oEscapeChar=none -oHashKnownHosts=no -oGlobalKnownHostsFile=/var/lib/ganeti/known_hosts -oUserKnownHostsFile=/dev/null -oHostKeyAlias=ganeti.schottelius.org -oBatchMode=yes -oStrictHostKeyChecking=yes -t root@tee.schottelius.org '/usr/bin/socat STDIO,echo=0,icanon=0 UNIX-CONNECT:/var/run/ganeti/kvm-hypervisor/ctrl/jr.nachtbrand.ch.serial'

The problem is that the newly debootstrapped system
*does not have a serial console set up*.

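One possible way out, which I have not tried on this instance, is to add a getty on the first serial port to the guest's `/etc/inittab`. The sketch below assumes a sysvinit guest; the function name `enable_serial_getty` is hypothetical, and the demonstration works on a scratch file rather than a real guest filesystem.

```shell
#!/bin/sh
# Append a getty entry for ttyS0 to the inittab file $1, unless an
# active (uncommented) entry for ttyS0 is already present.
# Assumption: the guest uses sysvinit and reads /etc/inittab.
enable_serial_getty() {
    inittab=$1
    if grep -q '^[^#].*ttyS0' "$inittab"; then
        return 0
    fi
    echo 'T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100' >> "$inittab"
}

# Demonstration on a scratch file instead of a real guest.
scratch=$(mktemp)
printf '%s\n' '1:2345:respawn:/sbin/getty 38400 tty1' > "$scratch"
enable_serial_getty "$scratch"
cat "$scratch"
rm -f "$scratch"
```

In practice the file would be the `etc/inittab` inside the mounted instance disk. To see kernel messages on the same console, the guest kernel would additionally need `console=ttyS0` on its command line.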

As you can see, by the evening of this day I had a lot of new experiences,
but *no reliably running virtualisation framework*. That brings me to the
end of this report:

 * User mode Linux does not work reliably under some I/O load.
 * Virt-manager is absolutely not able to change the simplest parameters.
 * Virsh is unusable if you don't want to edit XML files.
 * Ganeti has a lot of unhandled problems and still relies very much on Xen + LVM.

As my vacation ends next Monday, I will have a look at the commercial virtualisation
frameworks. For the folks behind the FOSS projects named above: Guys, you have to improve
a lot before one can call your software "good and clean software".

[[!tag unix vm]]