Re: [PATCH rebase] mm: fix vm-scalability regression in cgroup-aware workingset code

From: Michal Hocko
Date: Mon Jun 27 2016 - 09:05:52 EST


[Sorry for the late reply]

On Fri 24-06-16 13:51:01, Johannes Weiner wrote:
> This is a rebased version on top of mmots sans the nodelru stuff.
>
> ---
>
> 23047a96d7cf ("mm: workingset: per-cgroup cache thrash detection")
> added a page->mem_cgroup lookup to the cache eviction, refault, and
> activation paths, as well as locking to the activation path, and the
> vm-scalability tests showed a regression of -23%. While the test in
> question is an artificial worst-case scenario that doesn't occur in
> real workloads - reading two sparse files in parallel at full CPU
> speed just to hammer the LRU paths - there are still some optimizations
> that can be done in those paths.
>
> Inline the lookup functions to eliminate calls. Also, page->mem_cgroup
> doesn't need to be stabilized when counting an activation; we merely
> need to hold the RCU lock to prevent the memcg from being freed.
>
> This cuts down on overhead quite a bit:
>
> 23047a96d7cfcfca  063f6715e77a7be5770d6081fe
> ----------------  --------------------------
>          %stddev     %change         %stddev
>              \          |                \
>   21621405 +-  0%     +11.3%   24069657 +-  2%  vm-scalability.throughput
>
> Reported-by: Ye Xiaolong <xiaolong.ye@xxxxxxxxx>
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Minor note below.

> +static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
> +{

I guess an rcu_read_lock_held() check here would be appropriate, so that
callers which forget to take the RCU read lock get caught.

> +	return READ_ONCE(page->mem_cgroup);
> +}
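
To illustrate the kind of check meant here, below is a minimal userspace
sketch, not a patch. Everything except the body of page_memcg_rcu() is a
stand-in: the RCU functions are mocked with a plain nesting counter, and
READ_ONCE(), WARN_ON_ONCE(), rcu_depth and rcu_warnings are simplified or
hypothetical definitions made up for this example. Wrapping the check in
WARN_ON_ONCE() is also an assumption; the note above only asks for
rcu_read_lock_held() to be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only, NOT the kernel definitions. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct mem_cgroup { int id; };
struct page { struct mem_cgroup *mem_cgroup; };

static int rcu_depth;     /* hypothetical: mock read-side nesting depth */
static int rcu_warnings;  /* hypothetical: number of warnings emitted */

static void rcu_read_lock(void)
{
	rcu_depth++;
}

static void rcu_read_unlock(void)
{
	rcu_depth--;
	assert(rcu_depth >= 0);  /* unbalanced unlock is a bug in the caller */
}

static bool rcu_read_lock_held(void)
{
	return rcu_depth > 0;
}

/* Simplified mock: count the warning the first time cond is true. */
#define WARN_ON_ONCE(cond) do {			\
	static bool warned;			\
	if ((cond) && !warned) {		\
		warned = true;			\
		rcu_warnings++;			\
	}					\
} while (0)

/*
 * The accessor from the patch with the suggested check added. The memcg
 * pointer is only safe to use while the caller holds the RCU read lock.
 */
static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
{
	WARN_ON_ONCE(!rcu_read_lock_held());
	return READ_ONCE(page->mem_cgroup);
}

/* Example caller: look up the memcg id inside an RCU read-side section. */
static int page_memcg_id(struct page *page)
{
	struct mem_cgroup *memcg;
	int id = -1;

	rcu_read_lock();
	memcg = page_memcg_rcu(page);
	if (memcg)
		id = memcg->id;
	rcu_read_unlock();

	return id;
}
```

A caller like page_memcg_id() stays silent, while calling page_memcg_rcu()
outside of rcu_read_lock()/rcu_read_unlock() trips the warning once. The
check does not change the lookup itself; it only documents and enforces
the calling convention.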
--
Michal Hocko
SUSE Labs