[vbox-dev] Docs on how page fusion works?

Ramshankar ramshankar.venkataraman at oracle.com
Wed May 2 09:25:07 GMT 2012

On 05/ 2/12 08:11 AM, Howard Su wrote:
> On Wed, Apr 18, 2012 at 2:39 AM, Ramshankar
> <ramshankar.venkataraman at oracle.com> wrote:
>     Our current page fusion logic relies on knowledge from within the guest
>     about what can be fused. Instead of sweeping through the entire guest
>     memory with some sort of daemon/service, maintaining hashes and
>     last-touch times of pages and comparing them, our guest additions
>     (page fusion is currently implemented only for Windows guests) give hints
>     about which pages are the most likely candidates for fusion.
>     This saves a lot of time compared with sweeping the memory, but it also
>     means we will not be squeezing out every last bit. We made this trade-off
>     because we felt it is a good approach for fulfilling the objective.
> I read about a project, UKSM (http://code.google.com/p/uksm), which
> improves KSM's scan speed and results.

Definitely an interesting area of work.

> On one type of hardware, an Intel Core 2 9300, KSM scans pages at 260M/s.
> UKSM can hit 477MB/s - 923M/s, and even 627 - 2446MB/s on pages which
> don't contain duplicates. I think the speed is a big factor in making the
> tradeoff. A full memory scan can also avoid the duplication of anonymous
> memory pages. Will this change your decision about the tradeoff?

It's not such an easy decision; this sort of full scan depends heavily on
the guest and its memory access patterns. Scanning for active candidates
will take up CPU time on the host. Either way there is computation
involved somewhere; it's not free. Since the hosts are typically large
servers with loads of memory and CPU power, we don't yet see a pressing
need to change our current logic. Also, squeezing every last shared page
out of running VMs only matters if the host's memory consumption is very
high and you desperately need to make room for more VMs (we don't return
memory to the host for other applications). In such a case performance is
going to suffer anyway.
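
To make the cost of a full scan concrete, here is a minimal sketch of a
content-hash sweep over a flat memory image. It is an illustration only, not
how KSM, UKSM or VirtualBox actually work: the 4096-byte page size, the use
of SHA-1 and the function names are all assumptions made for the example. A
real scanner would also compare the page bytes after a hash match and track
pages that change between passes. The point is simply that every page has to
be read and hashed on every pass.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size, for illustration only


def find_duplicate_pages(memory, page_size=PAGE_SIZE):
    """Group page indexes by content hash; keep groups with 2+ pages.

    Hypothetical helper: trusts the hash alone, which a real
    deduplicator would not do.
    """
    groups = defaultdict(list)
    # Every full page is read and hashed: this is the CPU cost of a sweep.
    for offset in range(0, len(memory) - page_size + 1, page_size):
        page = memory[offset:offset + page_size]
        groups[hashlib.sha1(page).digest()].append(offset // page_size)
    return {digest: pages for digest, pages in groups.items() if len(pages) > 1}


def pages_saved(duplicates):
    """Pages that could be freed if each group shared a single copy."""
    return sum(len(pages) - 1 for pages in duplicates.values())
```

The hint-based approach skips this loop entirely and only looks at the pages
the guest reports, which is where the time saving comes from.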

We're not saying our solution is the perfect one; there's room for
improvement. We could, for instance, take a hybrid approach where the
guest additions provide information about RAM ranges after accounting
for well-known fusable pages based on the guest, but as of now I'm not
aware of any immediate plans to revamp or rewrite our existing solution.

>     A daemon runs on the guest which locates common system files, DLLs,
>     read-only kernel memory etc. that are paged in on the guest, and reports
>     the physical pages that can be deduplicated. We don't actively scan the
>     guest memory looking for fusion candidates. If the guest touches a page
>     for write access, it is marked as no longer a candidate. Because of
>     contextual knowledge from within the guest, VirtualBox's page fusion
>     identifies only long-term fusion candidates that are very unlikely to be
>     touched often.
>     That's just the broad overview.
>     Regards,
>     Ram
>     On 04/16/12 05:23 PM, Alexey Eromenko wrote:
>      >>
>      >> What kind of obstacles would I face if I tried to implement the
>      >> same behavior (scan processes) for Linux guests? I plan on scanning
>      >> every process, then checking the memory maps from /proc/<pid>/maps.
>      >> If the permissions are set to r only or rx then I'll register the
>      >> pages with the host. This wouldn't cover the process itself, but a
>      >> major portion of the wasted memory.
>      >>
>      >> Sounds simple (everything does these days) and I plan to work on it.
>      >> However, I just need an expert to give me the go-ahead since this is
>      >> all new to me.
>      >
>      > I think before undertaking such a massive effort, it pays off to
>      > compare existing (Open-Source) technologies: Linux KSM vs. VBox
>      > PageFusion.
>      >
>      > Why ?
>      > KSM *avoids* the need of developing guest-side drivers altogether.
>      > With KSM all mem dedup logic is done host-side-only, so all legacy
>      > and future OSes work out-of-the-box.
>      > VBox PageFusion requires GuestAdditions, which means developing and
>      > testing (!) drivers for lots of guest OSes and OS versions.
>      > Developing KSM-equivalent for VBox may pay off better than extending
>      > VBox PageFusion to Linux guests (this will require writing new Linux
>      > kernel drivers).
>      > KSM itself is Linux-host-only, so cannot be used directly. (While
>      > VBox PageFusion is Windows-guest-only ATM)
>      > KSM-like system will require only host-side development and testing.
>      >
>      > What needs to be considered ? (KSM-like approach vs. VBox PageFusion
>      > approach)
>      > 1. performance - how much CPU usage does it take ?
>      > 2. speed convergence [related to 1.] - how much time does it take to
>      > find 1 GiB of RAM and dedup it ?
>      > 3. efficiency - how many pages were actually shared ?
>      > 4. any other advantages/disadvantages of both approaches.
>      >
>      > Disclaimer: I have NOT tested either solution. Just my 2 cents.
> --
> -Howard
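
For reference, the Linux-guest idea quoted above (walk /proc/<pid>/maps and
pick out r-only and rx mappings) could start from something like the sketch
below. It only parses the text of a maps file and selects candidate ranges.
The names Mapping, parse_maps and fusion_candidates are made up for this
example, and restricting the result to file-backed mappings (path starting
with "/") is an extra assumption, not part of the original proposal. The
ranges are guest virtual addresses; translating them to physical pages and
registering those with the host is not shown and would need guest additions
support.

```python
import re
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # start virtual address
    end: int     # end virtual address (exclusive)
    perms: str   # e.g. "r-xp"
    path: str    # backing file, "[heap]", or "" for anonymous


# Line layout: address-range perms offset dev inode [pathname]
_MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+([rwxsp-]{4})\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$"
)


def parse_maps(text):
    """Yield a Mapping for each well-formed line of a maps file."""
    for line in text.splitlines():
        match = _MAPS_LINE.match(line.strip())
        if match:
            start, end, perms, path = match.groups()
            yield Mapping(int(start, 16), int(end, 16), perms, path)


def fusion_candidates(text):
    """Readable, non-writable, file-backed mappings (r-- or r-x).

    The file-backed filter is an assumption made for this sketch.
    """
    return [
        m for m in parse_maps(text)
        if m.perms[0] == "r" and m.perms[1] == "-" and m.path.startswith("/")
    ]
```

Typical use inside the guest would be to read the file for one process, for
example open("/proc/self/maps").read(), and pass the text to
fusion_candidates().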
