KSM++: Using I/O-based hints to make memory-deduplication scanners more efficient

Miller, Konrad; Franz, Fabian; Groeninger, Thorsten; Rittinghaus, Marc; Hillenbrand, Marius; Bellosa, Frank


Memory-scanning deduplication techniques, as implemented in Linux's Kernel Samepage Merging (KSM), work very well for deduplicating fairly static, anonymous pages with equal content across different virtual machines. However, scanners need very aggressive scan rates to identify sharing opportunities with a short life span of up to about 5 minutes. Otherwise, the scan process is not fast enough to catch those short-lived pages.
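To make the scan-rate problem concrete, the following back-of-the-envelope sketch estimates how long one full pass over guest memory takes for a rate-limited scanner. The parameter names mirror KSM's `pages_to_scan` and `sleep_millisecs` tunables, but the memory size and the values in the loop are illustrative assumptions, not measurements from this work. The estimate is a lower bound because it ignores the time spent on the scan work itself.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size


def full_pass_seconds(total_bytes, pages_per_wakeup, sleep_ms):
    """Lower bound on the duration of one full scan pass.

    The scanner is modeled as waking up, scanning `pages_per_wakeup`
    pages, and then sleeping for `sleep_ms` milliseconds.
    """
    total_pages = total_bytes // PAGE_SIZE
    wakeups = -(-total_pages // pages_per_wakeup)  # ceiling division
    return wakeups * sleep_ms / 1000.0


if __name__ == "__main__":
    gib = 1024 ** 3
    # Hypothetical setup: 4 GiB of scannable guest memory, 20 ms sleep.
    for pages in (100, 1000, 10000):
        seconds = full_pass_seconds(4 * gib, pages, 20)
        print(f"{pages:>6} pages per wakeup: {seconds:8.1f} s per pass")
```

A page that becomes shareable just after the scanner has passed it has to wait up to one full pass before it is visited again, so a sharing opportunity that lives for only a few minutes is easily missed unless the pass time is pushed well below its life span.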
Our approach generates I/O-based hints in the host to make the memory scanning process more efficient, thus enabling it to find and exploit short-lived sharing opportunities without raising the scan rate. Experiences with similar techniques for paravirtualized guests have shown that pages in a guest’s unified buffer cache are good sharing candidates. The host already sees such pages when it carries out I/O operations on behalf of the guest: the target/source pages in the guest can safely be assumed to be part of the guest’s unified buffer cache. That way, we can determine good sharing hints for the memory scanner. A modification of the guest is not required.
We have implemented our approach in Linux.
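The idea of hint-driven scanning can be illustrated with a small user-space model. This is a minimal sketch under stated assumptions, not the Linux implementation: the class `HintedScanner`, its method names, the dictionary that stands in for guest memory, and the use of SHA-256 digests for matching are all hypothetical choices made for illustration. Pages reported by the host's I/O path are queued as hints and visited before the regular linear scan, so duplicates among recently read or written pages are found within the same scan budget. Copy-on-write handling and the invalidation of stale entries after a page changes are omitted.

```python
from collections import deque
import hashlib


class HintedScanner:
    """Toy deduplication scanner that visits I/O-hinted pages first."""

    def __init__(self, memory):
        self.memory = memory          # (vm_id, gfn) -> page content (bytes)
        self.order = sorted(memory)   # fixed order for the linear scan
        self.cursor = 0               # position of the linear scan
        self.hints = deque()          # pages recently touched by host I/O
        self.hinted = set()           # avoids queueing a page twice
        self.by_digest = {}           # digest -> first page seen with it
        self.merged = {}              # duplicate page -> page it shares

    def on_guest_io(self, vm_id, gfn):
        """Called by the (hypothetical) host I/O path for each guest page."""
        key = (vm_id, gfn)
        if key not in self.hinted:
            self.hinted.add(key)
            self.hints.append(key)

    def _visit(self, key):
        digest = hashlib.sha256(self.memory[key]).digest()
        owner = self.by_digest.setdefault(digest, key)
        # Compare the contents again so a stale digest entry cannot merge.
        if owner != key and self.memory[owner] == self.memory[key]:
            self.merged[key] = owner

    def scan(self, budget):
        """Visit up to `budget` pages, serving hints before the linear scan."""
        for _ in range(budget):
            if self.hints:
                key = self.hints.popleft()
                self.hinted.discard(key)
            elif self.order:
                key = self.order[self.cursor]
                self.cursor = (self.cursor + 1) % len(self.order)
            else:
                return
            self._visit(key)


if __name__ == "__main__":
    # Two VMs with unique filler pages plus one identical disk block each.
    mem = {("vm0", i): i.to_bytes(8, "little") for i in range(1000)}
    mem.update({("vm1", i): (i + 10**6).to_bytes(8, "little") for i in range(1000)})
    mem[("vm0", 900)] = mem[("vm1", 950)] = b"same disk block"

    scanner = HintedScanner(mem)
    scanner.on_guest_io("vm0", 900)
    scanner.on_guest_io("vm1", 950)
    scanner.scan(budget=2)
    print(scanner.merged)
```

With the same budget of two pages and no hints, the linear scan would still be at the start of the first VM's memory, far from either copy of the shared block.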

