Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache control

From: Avi Kivity
Date: Wed Jun 16 2010 - 07:39:14 EST


On 06/15/2010 05:47 PM, Dave Hansen wrote:

>> That's a bug that needs to be fixed. Eventually the host will come
>> under pressure and will balloon the guest. If that kills the guest, the
>> ballooning is not effective as a host memory management technique.
>
> I'm not convinced that it's just a bug that can be fixed. Consider a
> case where a host sees a guest with 100MB of free memory at the exact
> moment that a database app sees that memory. The host tries to balloon
> that memory away at the same time that the app goes and allocates it.
> That can certainly lead to an OOM very quickly, even for very small
> amounts of memory (much less than 100MB). Where's the bug?
>
> I think the issues are really fundamental to ballooning.

There are two issues involved.

One is: can the kernel accurately determine the amount of memory it needs to work? We have resources such as RAM and swap. We have liabilities in the form of swappable userspace memory, mlocked userspace memory, kernel memory to support these, and various reclaimable and non-reclaimable kernel caches. Can we determine the minimum amount of RAM needed to support our workload at a point in time?

If we had this, we could modify the balloon driver to refuse to inflate if doing so would take the kernel beneath the minimum amount of RAM it needs.

In fact, this is similar to allocating memory with overcommit_memory = 0. The difference is the balloon allocates mlocked memory, while normal allocations can be charged against swap. But fundamentally it's the same.
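As a rough illustration (not the actual balloon driver logic, and not a worked-out policy), a userspace sketch along these lines could estimate such a minimum from /proc/meminfo and refuse a hypothetical balloon request that would drop the guest below it. The choice of fields and the 64 MB kernel reserve here are assumptions for illustration only:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one field (value in kB) from /proc/meminfo; returns 0 if absent. */
static unsigned long meminfo_kb(const char *field)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];
	unsigned long val = 0;

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, field, strlen(field))) {
			sscanf(line + strlen(field), " %lu", &val);
			break;
		}
	}
	fclose(f);
	return val;
}

int main(int argc, char **argv)
{
	/* Hypothetical balloon request, in kB, passed on the command line. */
	unsigned long request_kb = argc > 1 ? strtoul(argv[1], NULL, 10) : 0;

	unsigned long total      = meminfo_kb("MemTotal:");
	unsigned long anon       = meminfo_kb("AnonPages:");
	unsigned long unevict    = meminfo_kb("Unevictable:");
	unsigned long slab_unrec = meminfo_kb("SUnreclaim:");
	unsigned long swap_free  = meminfo_kb("SwapFree:");

	/* Anonymous memory that has no free swap to be pushed out to. */
	unsigned long pinned_anon = anon > swap_free ? anon - swap_free : 0;

	/*
	 * Assumed floor: unevictable (mlocked) pages, unswappable anon,
	 * unreclaimable slab, plus an arbitrary 64 MB reserve for the
	 * kernel itself.  Illustrative only, not the real accounting.
	 */
	unsigned long min_kb = unevict + pinned_anon + slab_unrec + 64 * 1024;

	printf("MemTotal %lu kB, estimated minimum %lu kB\n", total, min_kb);

	if (request_kb >= total || total - request_kb < min_kb)
		printf("refuse: giving up %lu kB would go below the estimated minimum\n",
		       request_kb);
	else
		printf("accept: %lu kB can be returned to the host\n",
		       request_kb);
	return 0;
}

The real balloon driver would have to make this decision inside the guest kernel, against the same counters the VM uses for overcommit accounting, but the shape of the check is the same.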

>>> If all the guests do this, then it leaves that much more free memory on
>>> the host, which can be used flexibly for extra host page cache, new
>>> guests, etc...
>>
>> If the host detects lots of pagecache misses it can balloon guests
>> down. If pagecache is quiet, why change anything?
>
> Page cache misses alone are not really sufficient. This is the classic
> problem where we try to differentiate streaming I/O (which we can't
> effectively cache) from I/O which can be effectively cached.

True. Random I/O across a very large dataset is also difficult to cache.

>> If the host wants to start new guests, it can balloon guests down. If
>> no new guests are wanted, why change anything?
>
> We're talking about an environment which we're always trying to
> optimize. Imagine that we're always trying to consolidate guests on to
> smaller numbers of hosts. We're effectively in a state where we
> _always_ want new guests.

If this came at no cost to the guests, you'd be right. But at some point guest performance will be hit by this, so the advantage gained from freeing memory will be offset by the disadvantage.

Also, memory is not the only resource. At some point you become cpu bound; at that point freeing memory doesn't help and in fact may increase your cpu load.

--
error compiling committee.c: too many arguments to function
