Oberon (Ganeti cluster master) failure

A regression in the memory allocated to the Xen control domain (dom0) on Oberon, combined with an exhaustive Ganeti query on my part that triggered the rebuild of a number of DRBD disks, led to a fatal out-of-memory condition that caused Oberon to lock up. This in turn affected the virtual machines running on it, including my-p0 and a number of less important dev servers. I forced an immediate power reset to recover the situation.

All VMs have now been restored, and I will look to migrate the Ganeti cluster master to its own physical server. I will also fix the memory regression.