On Wed, Jun 22, 2011 at 11:39 PM, Rik van Riel <riel@xxxxxxxxxx> wrote:
On 06/22/2011 07:19 AM, Izik Eidus wrote:
So what we say here is: it is better to have a little junk in the unstable
tree that gets flushed eventually anyway, instead of making the guest
slower....
This race does not affect the accuracy of ksm anyway, due
to the full memcmp that we will eventually perform...
With 2MB pages, I am not convinced they will get "flushed eventually",
because there is a good chance at least one of the 4kB pages inside
a 2MB page is in active use at all times.
I worry that the proposed changes may end up effectively preventing
KSM from scanning inside 2MB pages, when even one 4kB page inside
is in active use. This could mean increased swapping on systems
that run low on memory, which can be a much larger performance penalty
than ksmd CPU use.
We need to scan inside 2MB pages when memory runs low, regardless
of the accessed or dirty bits.
I agree on this point. The dirty bit and the young bit are by no means accurate. Even
on 4kB pages, there is always a chance that the pte is dirty but the contents
are actually the same. Yeah, the whole optimization contains trade-offs, and
trade-offs always have the potential to annoy someone. Just like
page-bit-based LRU approximations, none of them is perfect. But I think
it can benefit some people. So maybe we could provide a generic, balanced
solution plus fine-tuning interfaces, so that when it really gets
in someone's way, they have a way to work around it.
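To make the "balanced default plus tuning knobs" idea a bit more concrete,
here is a rough userspace sketch of the kind of policy I have in mind. It is
only an illustration: the struct names, the fields and ksm_should_scan() are
all hypothetical and nothing like them exists in mm/ksm.c. It assumes the
scanner can see a per-page dirty hint and some free-memory percentage.

```c
#include <stdbool.h>

/* Hypothetical tunables; nothing like this exists in mm/ksm.c. */
struct ksm_hint_tunables {
	bool use_dirty_hint;          /* master switch for the optimization */
	unsigned int low_mem_percent; /* below this free-memory %, ignore the hint */
};

/* Hypothetical per-page state as the scanner would see it. */
struct page_hint {
	bool pte_dirty;               /* dirty bit seen since the last scan */
};

/*
 * Decide whether the scanner should look at this page in this pass.
 * Returning true only makes the page a merge candidate; the full
 * memcmp would still have the final say on whether it is merged.
 */
static bool ksm_should_scan(const struct ksm_hint_tunables *t,
			    const struct page_hint *p,
			    unsigned int free_mem_percent)
{
	if (!t->use_dirty_hint)
		return true;          /* optimization off: scan everything */
	if (free_mem_percent < t->low_mem_percent)
		return true;          /* memory is tight: scan regardless of bits */
	return !p->pte_dirty;         /* otherwise skip pages that look volatile */
}
```

With something like this, the low-memory case Rik describes would fall back to
scanning everything, and anyone hurt by the hint could simply switch it off.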
Do you agree with my argument? :-)