Re: [PATCH 0/3] Make workingset detection logic memcg aware

From: Kamezawa Hiroyuki
Date: Tue Aug 11 2015 - 12:00:44 EST


On 2015/08/10 17:14, Vladimir Davydov wrote:
> On Sun, Aug 09, 2015 at 11:12:25PM +0900, Kamezawa Hiroyuki wrote:
> > On 2015/08/08 22:05, Vladimir Davydov wrote:
> > > On Fri, Aug 07, 2015 at 10:38:16AM +0900, Kamezawa Hiroyuki wrote:
> > > > ...
> > > > All? Hmm. It seems that mixing the record of global memory pressure
> > > > with the record of local memory pressure is just wrong.

> > > What makes you think so? An example of misbehavior caused by this would
> > > be nice to have.


> > By design, a memcg's LRU aging logic is independent of global memory
> > allocation/pressure.


> > Assume there are 4 containers (each caching heavily), with a 1GB limit
> > each, on a 4GB server:
> > # container A workingset=600M limit=1G (sleepy)
> > # container B workingset=300M limit=1G (works often)
> > # container C workingset=500M limit=1G (works slowly)
> > # container D workingset=1.2G limit=1G (works hard)
> > Container D keeps advancing the zone's distance counter through its
> > local memory reclaim. If active:inactive = 1:1, container D's pages can
> > be activated. When kswapd (global reclaim) runs, every container's LRU
> > will rotate.
> >
> > The chance of detecting refaults in A, B, and C is reduced by
> > container D's counter updates.
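
To make the skew concrete, here is a userspace toy model of a single
zone-wide counter (all names are invented for illustration; this is a
sketch, not the kernel code):

    #include <stdio.h>

    /* One eviction counter shared by every container in the zone. */
    static unsigned long zone_inactive_age;

    /* At page-out: snapshot the shared counter into the shadow entry. */
    static unsigned long record_eviction(void)
    {
            return ++zone_inactive_age;
    }

    /* At refault: distance = counter now - counter at eviction. */
    static int refault_should_activate(unsigned long shadow,
                                       unsigned long active_list_size)
    {
            return zone_inactive_age - shadow <= active_list_size;
    }

    int main(void)
    {
            /* Sleepy container A evicts one page... */
            unsigned long shadow_a = record_eviction();

            /* ...then busy container D evicts 300k pages under its own
             * limit, advancing the shared counter. */
            for (int i = 0; i < 300000; i++)
                    record_eviction();

            /* A's refault distance now looks like ~300k even though A
             * saw no pressure of its own, so its page stays inactive. */
            printf("activate A's page? %s\n",
                   refault_should_activate(shadow_a, 150000) ? "yes" : "no");
            return 0;
    }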

> This does not necessarily mean we have to use different inactive_age
> counters for global and local memory pressure. In your example, having
> inactive_age per lruvec and using it for evictions under both global and
> local memory pressure would work just fine.


You're right.



> > At page-in (check):
> >     if (current memcg == recorded memcg && eviction distance is okay)
> >         activate page.
> >     else
> >         inactivate.
> > At page-out:
> >     if (global memory pressure)
> >         record eviction id using the zone's counter.
> >     else if (memcg local memory pressure)
> >         record eviction id using the memcg's counter.
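
A compilable toy version of that record/check logic might look like the
below (a userspace sketch; NR_MEMCG, the shadow layout, and all names
are my invention, not the actual patch):

    #include <stdbool.h>

    #define NR_MEMCG 64

    /* What the shadow entry would have to remember. */
    struct shadow {
            int           memcg_id;  /* who evicted the page */
            bool          global;    /* evicted by global reclaim? */
            unsigned long eviction;  /* counter snapshot at eviction */
    };

    static unsigned long zone_age;             /* global-reclaim counter */
    static unsigned long memcg_age[NR_MEMCG];  /* local-reclaim counters */

    /* At page-out: pick the counter matching the kind of pressure. */
    static void record_eviction(struct shadow *s, int memcg_id, bool global)
    {
            s->memcg_id = memcg_id;
            s->global   = global;
            s->eviction = global ? ++zone_age : ++memcg_age[memcg_id];
    }

    /* At page-in: only trust the distance when the memcg matches. */
    static bool refault_should_activate(const struct shadow *s,
                                        int memcg_id,
                                        unsigned long active_list_size)
    {
            unsigned long now;

            if (s->memcg_id != memcg_id)
                    return false;  /* the "inactivate" branch above */
            now = s->global ? zone_age : memcg_age[s->memcg_id];
            return now - s->eviction <= active_list_size;
    }

Note how much state this needs per shadow entry; that is where the
record-space problem mentioned below comes from.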


> I don't understand how this is supposed to work when a memory cgroup
> experiences both local and global pressure simultaneously.


> > I think updating the global distance counter on local reclaim may
> > advance the counter too much.

> But if the inactive_age counter was per lruvec, then we wouldn't need to
> bother about it.

Yes.
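
Something like the following, I imagine (again a toy sketch with
invented names, not your patch): one counter per (memcg, zone) pair,
advanced by both kinds of reclaim, so containers stop skewing each
other's distances and no memcg check is needed at refault time.

    /* One aging counter per lruvec, i.e. per (memcg, zone) pair. */
    struct lruvec_toy {
            unsigned long inactive_age;
    };

    /* Both global and local reclaim advance the same per-lruvec counter. */
    static unsigned long record_eviction(struct lruvec_toy *lruvec)
    {
            return ++lruvec->inactive_age;  /* stored in the shadow entry */
    }

    static int refault_should_activate(struct lruvec_toy *lruvec,
                                       unsigned long shadow,
                                       unsigned long active_list_size)
    {
            return lruvec->inactive_age - shadow <= active_list_size;
    }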

Anyway, what I understand now is that we need to reduce the influence of
one memcg's behavior on other memcgs. Your way divides the counter
completely per lruvec; my idea was to implement a separate counter.
Doing it by calculation will be good, because we can't have enough
record space.
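
On the record-space point: a shadow entry has to fit in a single
radix-tree slot, i.e. one unsigned long minus a few flag bits, so a
memcg id plus an eviction counter can only be stored packed and
truncated. A sketch (the field widths are made up for illustration):

    #include <limits.h>

    #define MEMCG_ID_BITS 16
    #define FLAG_BITS     2   /* e.g. radix-tree entry-type bits */
    #define EVICTION_BITS \
            (sizeof(unsigned long) * CHAR_BIT - MEMCG_ID_BITS - FLAG_BITS)

    /* Pack memcg id + truncated eviction counter into one word.  The
     * truncation is why the distance must be computed rather than kept
     * as a full record: there is simply no room. */
    static unsigned long pack_shadow(unsigned int memcg_id,
                                     unsigned long eviction)
    {
            eviction &= (1UL << EVICTION_BITS) - 1;
            return ((unsigned long)memcg_id << EVICTION_BITS) | eviction;
    }

    static void unpack_shadow(unsigned long shadow,
                              unsigned int *memcg_id,
                              unsigned long *eviction)
    {
            *memcg_id = (shadow >> EVICTION_BITS) &
                        ((1U << MEMCG_ID_BITS) - 1);
            *eviction = shadow & ((1UL << EVICTION_BITS) - 1);
    }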


Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/