Properties of a Xen-AoE cluster
The defining qualities...
Xen-AoE Properties
Aside from using Xen and AoE in general, a few guiding principles in the design
of a Xen-AoE cluster make the concept work so well:
Diskless CPU nodes - This ensures that your data and virtual machines
are not trapped inside any one physical CPU unit. If a particular CPU node
dies, you can immediately restart its virtual machines on a different
node. It also means there are no disks to fail and bring down the
machine, no disks to add more heat to the chassis, and no disks to
block airflow to the CPUs. The CPU nodes themselves are PXE
booted, so you only have to change one disk image on the network to
change how all of the systems boot. The virtual machines can then be
network booted or handed block devices from the Domain0. If PXE boot is too much complication for you (and it isn't all that simple), booting from an inexpensive flash disk is another very good way to go.
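As a rough sketch of the "restart it somewhere else" idea above, the following Python picks a surviving CPU node with enough free RAM for a domain. The CpuNode type, the node names and the RAM figures are all made up for illustration; a real cluster would get this information from its own monitoring and would then start the domain with its Xen tools.

```python
from dataclasses import dataclass

@dataclass
class CpuNode:
    # Hypothetical record for one diskless CPU node.
    name: str
    alive: bool
    free_ram_mb: int

def pick_restart_node(nodes, domain_ram_mb):
    """Return the live node with the most free RAM that fits the domain, or None."""
    candidates = [n for n in nodes if n.alive and n.free_ram_mb >= domain_ram_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.free_ram_mb)

# Example inventory (assumed values): cpu1 has just died.
nodes = [
    CpuNode("cpu1", alive=False, free_ram_mb=8192),
    CpuNode("cpu2", alive=True, free_ram_mb=1024),
    CpuNode("cpu3", alive=True, free_ram_mb=4096),
]

target = pick_restart_node(nodes, domain_ram_mb=2048)
print(target.name if target else "no node has room")
```

Because the domain's disks live on the network rather than in the dead node, the choice of target depends only on which CPU nodes are up and have memory to spare.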
Gigabit Ethernet - Ethernet has become fast enough and cheap
enough that it is feasible to essentially replace the
traditional IDE cable in our servers with a piece of Ethernet cable. This
gives us the ability to combine lots of disks using a
simple, fast, cheap network technology and make that storage available to
many CPUs on a network.
SATA disks - Although you could use SCSI, the price/performance
difference these days suggests that it makes more sense to simply buy
more SATA disks and get more spindles than to spend a lot of money
on SCSI disks and controllers. A recent study
has shown that the difference in reliability between SATA and SCSI or FC disks
is smaller than we may have thought, if there is any difference at all. In addition to cheap bulk SATA storage, you can implement multiple price/performance tiers of storage, using AoE to share everything up to the latest high-IOPS solid state disks with your virtual machines.
RAID - This is an important part of ensuring that the cluster
survives disk failures. With a Xen-AoE cluster you can use RAID 5
inside a disk node that exports block devices via AoE, so that a single
disk failure does not cause data loss, and then you can RAID 1 two
disk nodes together to ensure that even if a whole disk node fails
(or the switch port it is connected to), you do not have any
interruption in service. This also gets you lots of spindles for
random access, which is exactly what you need for performance, as it
is nearly always disk head seek time that slows things down.
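To make the cost of this layering concrete, here is a small Python sketch of the capacity arithmetic for RAID 5 inside each disk node with RAID 1 across two nodes. The disk count and disk size are assumed example values, not recommendations, and the function names are just for this illustration.

```python
def raid5_usable_gb(disks, disk_gb):
    """RAID 5 gives up one disk's worth of space to parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_gb

def raid1_usable_gb(node_a_gb, node_b_gb):
    """A mirror of two disk nodes can hold only as much as the smaller one."""
    return min(node_a_gb, node_b_gb)

# Assumed example: two disk nodes, each with 8 SATA disks of 500 GB.
disks_per_node = 8
disk_gb = 500

node_gb = raid5_usable_gb(disks_per_node, disk_gb)
pair_gb = raid1_usable_gb(node_gb, node_gb)
raw_gb = 2 * disks_per_node * disk_gb
spindles = 2 * disks_per_node

print(f"{pair_gb} GB usable out of {raw_gb} GB raw, spread over {spindles} spindles")
```

You give up more than half of the raw space, but in exchange the pair survives a disk failure in either node or the loss of a whole node, and every one of those spindles is available for random reads.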
Zero single points of failure - If the CPU/RAM/motherboard/power
supply dies, you restart the domain on a different CPU node. If a disk dies,
RAID 1/5/6 covers it. If a disk node dies, RAID 1 covers it. If a switch
dies, only the equipment on that switch goes down and the rest of the
equipment stays up. Make sure the two disk nodes in each RAID 1 pair are
connected to different switches, and divide your CPU
nodes across switches too.
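The switch rule is easy to get wrong as a cluster grows, so it can be worth checking mechanically. This Python sketch flags any RAID 1 pair whose two disk nodes hang off the same switch. The node and switch names are hypothetical, and the mapping would have to come from your own cabling records.

```python
def pairs_sharing_a_switch(switch_of, raid1_pairs):
    """Return the RAID 1 pairs whose two disk nodes are on the same switch."""
    return [(a, b) for a, b in raid1_pairs if switch_of[a] == switch_of[b]]

# Hypothetical cabling: which switch each disk node is plugged into.
switch_of = {
    "disk1": "sw-a",
    "disk2": "sw-b",
    "disk3": "sw-a",
    "disk4": "sw-a",
}

# Hypothetical mirrors: each tuple is one RAID 1 pair of disk nodes.
raid1_pairs = [("disk1", "disk2"), ("disk3", "disk4")]

bad = pairs_sharing_a_switch(switch_of, raid1_pairs)
print(bad)  # [('disk3', 'disk4')]
```

Here the disk3/disk4 mirror would lose both halves if sw-a died, so one of the two should be moved to another switch.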
Commodity hardware - The motherboards, cases, CPUs, switches, everything here is
commodity. You can get spare parts in a matter of hours and swap them out
yourself with little more skill than the guy at the corner mom-and-pop
PC hardware store has. This keeps prices down and maintainability up.