Edward Saksehånd asked:
I have a server (single socket Nehalem w/ 24GB RAM) acting primarily as a KVM host containing a bunch of Windows Servers and a few (tickless) Linux instances.
I usually compile my desktop kernels preemptible with 1000 Hz tickless timer, using the BFS CPU scheduler (via the CK patchset) and BFQ disk scheduler.
On servers, I do it all vanilla with 100 Hz non-tickless using CFQ and no forced preemption.
However, I don’t have the time or the skills to do benchmarking on this, so I’m looking for some input on the optimal settings for a KVM kernel. Would the throughput of the virtual machines benefit from a 1000 Hz kernel?
And would it be a bad idea to use the BFS scheduler? I’ve heard that it may bring benefits to single CPU servers as well.
I’m also thinking about using the BFQ disk scheduler with the low_latency option disabled.
Can anyone point me in the right direction here? I’m kind of a newbie when it comes to low-level system stuff. 🙂
To begin with, the “canonical” reference for KVM hypervisor tuning is still IBM’s excellent Best Practices for KVM, which I suggest you go through point by point.
Some things you will almost certainly want to do, after carefully testing with your intended workload:
Use virtio drivers in your Windows guests. You should already be doing this; if you aren’t, this will give you a very noticeable speedup. Linux guests should automatically use virtio from installation, though if you are virtualizing very old Linux systems, double check them.
Dump BFS. It was designed for low-latency desktop loads on low-end hardware, and its author admits that it will “not scale to massive hardware”. That doesn’t inspire confidence for a hypervisor.
Dump BFQ/CFQ. Virtually everyone gets the highest performance with the deadline I/O scheduler, and while you should test, you likely won’t be an exception.
Make sure kernel samepage merging (KSM) is running, and tune it appropriately. This can significantly reduce memory requirements on your hypervisor, especially when multiple guests run the same OS.
When using local storage, use raw block devices, such as LVM logical volumes, rather than image files. This removes a layer of abstraction from disk I/O.
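To see which I/O scheduler each disk on the host is currently using before you change anything, something like the following rough sketch may help. It assumes the usual sysfs layout, where `/sys/block/<device>/queue/scheduler` lists the available schedulers with the active one in square brackets. The function names `parse_scheduler_line` and `report_schedulers` are my own, made up for illustration.

```python
import re
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line such as 'noop deadline [cfq]'.

    Returns (active, available): the bracketed scheduler name (or None
    if nothing is bracketed) and the list of all names on the line.
    """
    available = line.replace("[", "").replace("]", "").split()
    match = re.search(r"\[([^\]]+)\]", line)
    active = match.group(1) if match else None
    return active, available


def report_schedulers(sys_block=Path("/sys/block")):
    """Print the active and available I/O schedulers for each block device.

    Assumes the standard Linux sysfs layout; devices without a
    queue/scheduler file are simply skipped.
    """
    for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
        device = sched_file.parent.parent.name
        active, available = parse_scheduler_line(sched_file.read_text())
        print(f"{device}: active={active} available={available}")
```

Calling `report_schedulers()` on the host prints one line per device. Switching is normally done by writing the scheduler name into that same sysfs file as root. Note that the name is `deadline` on older kernels and `mq-deadline` on newer multi-queue ones, so check the available list first.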
There are many other things covered in IBM’s guide I referred to earlier, but these should give you the most bang for your buck.
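For the kernel samepage merging (KSM) point, here is a small sketch for checking whether it is running and roughly how much memory it is saving. It assumes the standard KSM sysfs files under `/sys/kernel/mm/ksm` (`run`, `pages_shared`, `pages_sharing`, `pages_to_scan`, `sleep_millisecs`). The helper names `read_ksm_stats` and `estimated_saved_mib` are hypothetical, and the savings figure is only an estimate derived from `pages_sharing`.

```python
import os
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")
KSM_FIELDS = ("run", "pages_shared", "pages_sharing",
              "pages_to_scan", "sleep_millisecs")


def read_ksm_stats(ksm_dir=KSM_DIR):
    """Read the KSM counters that exist under ksm_dir into a dict of ints.

    Missing files are skipped, so an empty dict means KSM is not
    available (or ksm_dir does not exist).
    """
    stats = {}
    for name in KSM_FIELDS:
        path = ksm_dir / name
        if path.exists():
            stats[name] = int(path.read_text().strip())
    return stats


def estimated_saved_mib(stats, page_size=None):
    """Estimate memory saved by KSM in MiB from the pages_sharing counter.

    page_size defaults to the system page size; this is an approximation,
    not an exact accounting.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return stats.get("pages_sharing", 0) * page_size / (1024 * 1024)
```

A `run` value of 1 means KSM is actively scanning. If `pages_sharing` stays near zero while several similar guests are up, that is a hint to look at `pages_to_scan` and `sleep_millisecs`, testing any change against your own workload.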
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.