Sunday, 2 August 2015

EPT and KSM for High Throughput Computing

As part of the analysis of CERN's compute intensive workload in a virtualised infrastructure, we have been examining various settings of KVM to tune the performance.

EPT

EPT (Extended Page Tables) is an Intel technology which provides hardware assistance for virtualisation and is exposed as one of the options of the Intel KVM driver (kvm_intel) in Linux. It is turned on by default but can be controlled through the driver's module options. The driver can then be reloaded, as long as no qemu-kvm processes are running.

# cat /etc/modprobe.d/kvm_intel.conf
options kvm_intel ept=0
# modprobe -r kvm_intel
# modprobe kvm_intel
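
After reloading the driver, it can be worth confirming that the setting took effect. The following is a minimal sketch, assuming the kvm_intel module exposes its ept parameter as Y or N under /sys/module/kvm_intel/parameters/ept. The ept_status function name and its optional path argument are illustrative only and not part of any existing tool.

```shell
# Report whether EPT is enabled, based on the kvm_intel module parameter.
# The optional first argument (illustrative, handy for testing) overrides
# the sysfs path that is read.
ept_status() {
    param_file="${1:-/sys/module/kvm_intel/parameters/ept}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or not an Intel hypervisor
        echo "unknown"
        return 1
    fi
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        N|0) echo "disabled" ;;
        *)   echo "unknown" ;;
    esac
}
```

Running ept_status with no argument on the hypervisor after the modprobe should print "disabled" if the option above was picked up.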

In past studies, EPT has had a negative performance impact on High Energy Physics applications. With recent changes in processor architecture, this was re-tested.

The result is a 6% performance improvement with EPT off. This seems surprising, as EPT is intended to improve virtualisation performance rather than reduce it.

The CERN configuration uses hypervisors running CentOS 7 and guests running Scientific Linux CERN 6. With this configuration, EPT can be turned off without problems. However, a recent test with CentOS 7 guests has shown that turning EPT off causes an issue, which has been reported upstream: only one CPU is recognised and the rest are reported as being unresponsive.

KSM

Kernel same-page merging (KSM) is a technology which finds identical memory pages inside a Linux system and merges them so there is only a single copy, saving memory resources. If one of the copies is updated, a new copy is created, so the function is transparent to the processes on the system.

For hypervisors, this can be very beneficial where multiple guests are running with the same level of operating system. However, there is an overhead due to the scanning process which may cause the applications to run more slowly. 
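
To judge how much a given hypervisor actually benefits, the kernel's KSM counters can be inspected. The sketch below assumes the standard /sys/kernel/mm/ksm/pages_sharing counter, which the kernel documentation describes as an indication of how much memory is being saved. The ksm_saved_kib function name and its optional arguments are illustrative only.

```shell
# Estimate the memory currently saved by KSM, in KiB.
# Arguments (both optional, illustrative, mainly for testing):
#   $1  directory holding the KSM counters
#   $2  page size in bytes
ksm_saved_kib() {
    ksm_dir="${1:-/sys/kernel/mm/ksm}"
    page_size="${2:-$(getconf PAGESIZE)}"
    # pages_sharing counts the additional sites sharing a merged page,
    # i.e. roughly the number of pages saved by merging
    pages_sharing=$(cat "$ksm_dir/pages_sharing") || return 1
    echo $(( pages_sharing * page_size / 1024 ))
}
```

A large value here suggests that turning KSM off would noticeably increase memory usage on that hypervisor.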

We benchmarked 4 VMs, each with 8 cores, running the same operating system levels. The results showed that KSM causes an overhead of around 1%.



To turn KSM off, the ksmtuned daemon should be stopped and disabled so that it does not start again at boot.

# systemctl stop ksmtuned
# systemctl disable ksmtuned

The ksmd kernel thread still seems to run but does not use any CPU resources. Following the change, it is important to verify that there is still sufficient memory on the hypervisor, since not merging the pages could cause an increase in memory usage and lead to swapping (which has a very significant performance impact).
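
One simple way to watch for swapping after the change is to track swap usage from /proc/meminfo. This is a minimal sketch, assuming the usual SwapTotal and SwapFree fields reported in kB. The swap_used_kib function name and its optional file argument are illustrative only.

```shell
# Print the amount of swap in use, in KiB, from a meminfo-style file.
# The optional first argument (illustrative, for testing) overrides the
# default /proc/meminfo.
swap_used_kib() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "$meminfo"
}
```

Comparing the value before and some time after turning KSM off gives a rough indication of whether the hypervisor has started to swap.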

This work was done in collaboration with Sean Crosby (University of Melbourne) and Arne Wiebalck and Ulrich Schwickerath (CERN).


References

  • Intel article on EPT - https://01.org/blogs/tlcounts/2014/virtualization-advances-performance-efficiency-and-data-protection
  • Previous studies with KVM and HEP code - https://indico.cern.ch/event/35523/session/28/contribution/246/attachments/705127/968004/HEP_Specific_Benchmarks_of_Virtual_Machines_on_multi-core_CPU_Architectures.pdf
  • VMware paper on EPT performance - https://www.vmware.com/pdf/Perf_ESX_Intel-EPT-eval.pdf