All we need is to select the right page to drop.

Do we need to drop to the granularity of the page to drop? I think
figuring out the class of pages and making sure that we don't write
our own reclaim logic, but work with what we have to identify the
class of pages is a good start.

How can the host tell if there is duplication? It may know it has
some pagecache, but it has no idea whether or to what extent guest
pagecache duplicates host pagecache.

Well, it is possible in host user space. I, for example, use the memory
cgroup, and through the stats I have a good idea of how much is duplicated.
I am of course making an assumption with my setup of the cached mode,
that the data in the guest page cache and page cache in the cgroup
will be duplicated to a large extent. I did some trivial experiments,
like dropping the data from the guest and looking at the cost of bringing
it back in, then dropping the data from both guest and host and looking
at the cost. I could see a difference.

Unfortunately, I did not save the data, so I'll need to redo the
experiment.
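
For illustration only, here is a minimal sketch of how the host-side number might be pulled out of the memory cgroup stats. It assumes cgroup v1 field names (a `cache` line in `memory.stat`, in bytes) and that the guest's page cache size is obtained separately. The function names and the sample values are hypothetical, and `min()` only gives an upper bound on the overlap, resting on the same "largely duplicated" assumption stated above.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines, as found in a cgroup memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <non-negative integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def duplication_upper_bound(guest_cache_bytes, host_cache_bytes):
    """At most min(guest, host) bytes can be cached on both sides."""
    return min(guest_cache_bytes, host_cache_bytes)


# Made-up sample contents, not measured data.
SAMPLE_MEMORY_STAT = """cache 1073741824
rss 268435456
mapped_file 134217728
"""

if __name__ == "__main__":
    host = parse_memory_stat(SAMPLE_MEMORY_STAT)
    guest_cache = 805306368  # hypothetical value reported from inside the guest
    print(duplication_upper_bound(guest_cache, host["cache"]))
```

On a real host the text would come from the `memory.stat` file of the cgroup that contains the guest; the exact path depends on how the cgroup hierarchy is mounted.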

It doesn't, really. The host only has aggregate information about
itself, and no information about the guest.
Dropping duplicate pages would be good if we could identify them.
Even then, it's better to drop the page from the host, not the
guest, unless we know the same page is cached by multiple guests.

On the exact pages to drop, please see my comments above on the class
of pages to drop.
There are reasons for wanting to get the host to cache the data
Unless the guest is using cache = none, the data will still hit the
host page cache
The host can do a better job of optimizing the writeouts

But why would the guest voluntarily drop the cache? If there is no
memory pressure, dropping caches increases cpu overhead and latency
even if the data is still cached on the host.

So, there are basically two approaches
1. First patch, proactive - enabled by a boot option
2. When ballooned, we try to (please NOTE try to) reclaim cached pages
first. Failing which, we go after regular pages in the alloc_page()
call in the balloon driver.
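
As a rough illustration of the second approach, the ordering can be modelled as a toy in user space: satisfy the balloon target from unmapped page cache first, and take the remainder from regular pages only if that falls short. This is not the balloon driver code. `GuestMemory`, `inflate_balloon` and the page counts are hypothetical, and the model assumes every cached page is reclaimable, whereas the real reclaim is only a best-effort "try".

```python
from dataclasses import dataclass


@dataclass
class GuestMemory:
    unmapped_cache_pages: int  # page cache pages not mapped by any process
    other_pages: int           # everything else the balloon could take


def inflate_balloon(mem, target_pages):
    """Take target_pages from the guest, cached pages first.

    Returns (pages_from_cache, pages_from_other).
    """
    # Step 1: try to satisfy the request from unmapped page cache.
    from_cache = min(target_pages, mem.unmapped_cache_pages)
    mem.unmapped_cache_pages -= from_cache

    # Step 2: fall back to regular pages for whatever is still missing.
    remaining = target_pages - from_cache
    from_other = min(remaining, mem.other_pages)
    mem.other_pages -= from_other

    return from_cache, from_other


if __name__ == "__main__":
    mem = GuestMemory(unmapped_cache_pages=100, other_pages=500)
    print(inflate_balloon(mem, 150))
    print(mem)
```

The point of the ordering is that pages the host is likely to hold a copy of are given up before pages that would be more expensive to bring back.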

2. Drop the cache on either a special balloon option, again the host
knows it caches that very same information, so it prefers to free that
up first.

Dropping in response to pressure is good. I'm just not convinced
the patch helps in selecting the correct page to drop.

That is why I've presented data on the experiments I've run and
provided more arguments to back up the approach.