Re: WARNINGs in set_task_reclaim_state with memory cgroup and full memory usage

From: Michal Hocko
Date: Mon Aug 26 2019 - 06:55:26 EST

On Fri 23-08-19 18:03:01, Yang Shi wrote:
> On 8/23/19 3:00 PM, Adric Blake wrote:
> > Synopsis:
> > A WARN_ON_ONCE is hit twice in set_task_reclaim_state under the
> > following conditions:
> > - a memory cgroup has been created and a task assigned to it
> > - memory.limit_in_bytes has been set
> > - memory has filled up, likely from cache
> >
> > In my usage, I create a cgroup under the current session scope and
> > assign a task to it. I then set memory.limit_in_bytes and
> > memory.soft_limit_in_bytes for the cgroup to reasonable values, say
> > 1G/512M. The program accesses large files frequently and gradually
> > fills memory with the page cache. The warnings appear when the
> > entirety of the system memory is filled, presumably from other
> > programs.
> >
> > If I wait until the program has filled the entirety of system memory
> > with cache and then assign a memory limit, the warnings appear
> > immediately.
> It looks like the warning is triggered because kswapd sets reclaim_state and
> then the memcg soft limit reclaim in the same kswapd sets it again.

Yes, this is indeed the case. The same seems possible from the direct
reclaim AFAICS.
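
To make the nesting concrete, here is a small userspace model of the
situation. It is only a sketch: the struct layouts, warn_on(),
soft_limit_reclaim() and global_reclaim() are hypothetical stand-ins and not
the kernel's code, and the two checks are assumed to be "overwriting a set
pointer" and "clearing an already-clear pointer". Under that assumption, an
outer reclaim context that calls into an inner one which sets and clears the
state again trips both checks, which would match the two warnings reported.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct reclaim_state { unsigned long reclaimed_slab; };
struct task { struct reclaim_state *reclaim_state; };

/* Counts tripped checks (a userspace substitute for WARN_ON_ONCE). */
static int warnings;

static void warn_on(int cond, const char *what)
{
	if (cond) {
		warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
}

/* Assumed checks: no overwrite of a set pointer, no clearing of a clear one. */
static void set_task_reclaim_state(struct task *t, struct reclaim_state *rs)
{
	warn_on(rs && t->reclaim_state, "overwriting a live reclaim_state");
	warn_on(!rs && !t->reclaim_state, "clearing an already-clear reclaim_state");
	t->reclaim_state = rs;
}

/* Inner path, modelled on soft limit reclaim running inside global reclaim. */
static void soft_limit_reclaim(struct task *t, int sets_state)
{
	struct reclaim_state inner = { 0 };

	if (sets_state)
		set_task_reclaim_state(t, &inner);	/* overwrites the outer state */
	/* ... reclaim work would happen here ... */
	if (sets_state)
		set_task_reclaim_state(t, NULL);	/* leaves the outer context cleared */
}

/* Outer path, modelled on kswapd or direct reclaim; returns warnings hit. */
int global_reclaim(int inner_sets_state)
{
	struct task t = { NULL };
	struct reclaim_state outer = { 0 };

	warnings = 0;
	set_task_reclaim_state(&t, &outer);
	soft_limit_reclaim(&t, inner_sets_state);
	set_task_reclaim_state(&t, NULL);	/* already NULL if the inner path cleared it */
	return warnings;
}
```

In this model, global_reclaim(1) trips one check on the inner set and one on
the outer clear, while global_reclaim(0), where the inner path leaves the
state alone, trips none.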

> But kswapd and memcg soft limit reclaim use different reclaim_states from
> different scan controls. That does not sound correct; they should use the
> same reclaim_state if they come from the same context, if my understanding
> is correct.

I haven't checked very closely and I might be wrong, but setting the
reclaim state from mem_cgroup_shrink_node doesn't make any sense in
the current code. Soft limit reclaim is always called from global
reclaim, and both kswapd and direct reclaim already track reclaim
state correctly. We just haven't noticed until now because the warning is
quite recent and most likely only a few people tend to use the soft limit
these days.

That being said, we should simply do this instead: