We are currently finalizing the server consolidation in our department. The product we chose for virtualization is OpenVZ, because it sports creepy Russians.

All in all, it was a bit of a roller coaster ride, but once we figured out that most of the problems came from our own incompetence, we quickly stopped pointing fingers and shaking fists and instead read some documentation. Then all was good. We went from 12 servers to 5, killing 7 physical servers and saving roughly 1500W of power consumption. The new virtualization servers are actually the old database server and the old main web server, both of which were overpowered for their original jobs. Over the last few years, a change in mentality and in the technical competence level required from our students has made the extra power in these boxes unnecessary. Now we’re using them much more efficiently because each of them runs several virtual servers.

Technicalities

We wasted a lot of time, er, learned a lot, by going the opposite route when it comes to OpenVZ configuration. Most people are advised to start with a BIG configuration for each virtual private server (VPS); we started with a tiny one. This meant that memory parameters were at a bare minimum, normally mimicking the specs of the physical machine we were virtualizing. At the same time, we grouped services differently so that we could reduce the number of servers, again making better use of the available hardware. For servers that were created from scratch (not based on an existing physical machine), we also started from a minimal config file and went up from there.

This approach not only made me grow at least six new white hairs in my beard, but it also taught me about the importance of KMEMSIZE. KMEMSIZE is your friend. KMEMSIZE loves you. KMEMSIZE is soft and fluffy. Trust KMEMSIZE.

The problem with KMEMSIZE was that while we did assign enough memory in the main UBC memory categories (vmguarpages, oomguarpages, privvmpages), we didn’t have enough KMEMSIZE for our NUMPROCS. Just picture this! The poor NUMPROCS! The net result was that the server couldn’t fork new processes once its KMEMSIZE allowance was eaten up. So after multiplying our expected NUMPROCS by the estimated unswappable memory consumption per process, and giving each VPS at least that much KMEMSIZE, things worked perfectly.
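
For the arithmetically inclined, here is a rough Python sketch of that multiplication. It is not the script we used, just an illustration: the function names, the 40 KiB per-process figure, and the 10% headroom between barrier and limit are all assumptions made up for the example, so measure your own processes before trusting any of them. The output uses the barrier:limit form found in OpenVZ container config files.

```python
def estimate_kmemsize(numproc, kernel_bytes_per_proc=40 * 1024, headroom_percent=10):
    """Return a (barrier, limit) pair in bytes for KMEMSIZE.

    numproc               -- expected maximum number of processes in the VPS
    kernel_bytes_per_proc -- assumed unswappable kernel memory per process
                             (40 KiB is a placeholder, not a measured value)
    headroom_percent      -- assumed gap between barrier and limit
    """
    barrier = numproc * kernel_bytes_per_proc
    # Integer arithmetic keeps the result exact and free of float rounding.
    limit = barrier + barrier * headroom_percent // 100
    return barrier, limit


def format_kmemsize(barrier, limit):
    """Format the pair as a barrier:limit config line."""
    return f'KMEMSIZE="{barrier}:{limit}"'


if __name__ == "__main__":
    # Hypothetical VPS that we expect to run up to 200 processes.
    barrier, limit = estimate_kmemsize(200)
    print(format_kmemsize(barrier, limit))
```

If forks still fail after this, the per-process estimate is probably too low for your workload; bump it up and try again.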

Now our OpenVZ environment is very, very stable and very, very efficient. We’re very, very happy. This is a very, very success story.