Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cachecontrol
From: Avi Kivity
Date: Tue Jun 15 2010 - 05:54:40 EST
On 06/15/2010 10:52 AM, Balbir Singh wrote:
That is why the policy (in the next set) will come from the host. As
to whether the data is truly duplicated, my experiments show up to 60%
of the page cache is duplicated.
Isn't that incredibly workload dependent?
We can't expect the host admin to know whether duplication will
occur or not.
I was referring to cache = (policy) we use based on the setup. I don't
think the duplication is too workload specific. Moreover, we could use
aggressive policies and restrict page cache usage or do it selectively
on ballooning. We could also add other options to make the ballooning
option truly optional, so that the system management software decides.
Consider a read-only workload that exactly fits in guest cache. Without
trimming, the guest will keep hitting its own cache, and the host will
see no access to the cache at all. So the host (assuming it is under
even low pressure) will evict those pages, and the guest will happily
use its own cache. If we start to trim, the guest will have to go to
disk. That's the best case.
Now for the worst case. A random access workload that misses the cache
on both guest and host. Now every page is duplicated, and trimming
guest pages allows the host to increase its cache, and potentially
reduce misses. In this case trimming duplicated pages works.
Real life will see a mix of this. Often used pages won't be duplicated,
and less often used pages may see some duplication, especially if the
host cache portion dedicated to the guest is bigger than the guest cache.
I can see that trimming duplicate pages helps, but (a) I'd like to be
sure they are duplicates and (b) often trimming them from the host is
better than trimming them from the guest.
Trimming from the guest is worthwhile if the pages are not used very
often (but enough that caching them in the host is worth it) and if the
host cache can serve more than one guest. If we can identify those
pages, we don't risk degrading best-case workloads (as defined above).
(note ksm to some extent identifies those pages, though it is a bit
expensive, and doesn't share with the host pagecache).
--
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/