Re: [PATCH] mm: memcontrol: asynchronous reclaim for memory.high

From: Daniel Jordan
Date: Thu Feb 20 2020 - 10:45:29 EST


+Peter

On Wed, Feb 19, 2020 at 05:08:59PM -0500, Johannes Weiner wrote:
> On Wed, Feb 19, 2020 at 04:41:12PM -0500, Daniel Jordan wrote:
> > On Wed, Feb 19, 2020 at 08:53:32PM +0100, Michal Hocko wrote:
> > > On Wed 19-02-20 14:16:18, Johannes Weiner wrote:
> > > > On Wed, Feb 19, 2020 at 07:37:31PM +0100, Michal Hocko wrote:
> > > > > On Wed 19-02-20 13:12:19, Johannes Weiner wrote:
> > > > > > This patch adds asynchronous reclaim to the memory.high cgroup limit
> > > > > > while keeping direct reclaim as a fallback. In our testing, this
> > > > > > eliminated all direct reclaim from the affected workload.
> > > > >
> > > > > Who is accounted for all the work? Unless I am missing something this
> > > > > just gets hidden in the system activity and that might hurt the
> > > > > isolation. I do see how moving the work to a different context is
> > > > > desirable but this work has to be accounted properly when it is going to
> > > > > become a normal mode of operation (rather than a rare exception like the
> > > > > existing irq context handling).
> > > >
> > > > Yes, the plan is to account it to the cgroup on whose behalf we're
> > > > doing the work.
> >
> > How are you planning to do that?
> >
> > I've been thinking about how to account a kernel thread's CPU usage to a cgroup
> > on and off while working on the parallelizing Michal mentions below. A few
> > approaches are described here:
> >
> > https://lore.kernel.org/linux-mm/20200212224731.kmss6o6agekkg3mw@xxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> What we do for the IO controller is execute the work unthrottled but
> charge the cgroup on whose behalf we are executing with whatever cost
> or time or bandwidth that was incurred. The cgroup will pay off this
> debt when it requests more of that resource.
>
[snip code pointers]

Thanks!  Figuring out how the io controllers deal with remote charging was on
my list, so this makes it easier.
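
For illustration only, here is a rough userspace model of the debt scheme
described above: the work runs unthrottled, its cost is recorded against the
cgroup, and the cgroup pays that off the next time it asks for the resource.
Everything in it (struct toy_io_cgroup, toy_charge_remote(), toy_request(),
the per-period budget) is made up for the sketch and is not how the io
controllers are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup state for this sketch; not a kernel structure. */
struct toy_io_cgroup {
	uint64_t budget;	/* units the group may still consume this period */
	uint64_t debt;		/* cost incurred on its behalf by remote work */
};

/* The remote work already ran unthrottled; only record what it cost. */
static void toy_charge_remote(struct toy_io_cgroup *cg, uint64_t cost)
{
	assert(cg != NULL);
	cg->debt += cost;
}

/*
 * The group requests more of the resource.  Outstanding debt is paid out
 * of the budget first; only what is left can satisfy the request.
 * Returns the number of units actually granted.
 */
static uint64_t toy_request(struct toy_io_cgroup *cg, uint64_t want)
{
	uint64_t pay, granted;

	assert(cg != NULL);

	/* Pay down as much debt as the remaining budget allows. */
	pay = cg->debt < cg->budget ? cg->debt : cg->budget;
	cg->debt -= pay;
	cg->budget -= pay;

	/* Grant the request from whatever budget is left. */
	granted = want < cg->budget ? want : cg->budget;
	cg->budget -= granted;
	return granted;
}
```

With a budget of 100 and 30 units charged remotely, a request for 50 is
granted in full but leaves only 20 for the next request, which is where the
group ends up paying for the work done on its behalf.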

> The plan for the CPU controller is similar. When a remote execution
> begins, flush the current runtime accumulated (update_curr) and
> associate the current thread with another cgroup (similar to
> current->active_memcg); when remote execution is done, flush the
> runtime delta to that cgroup and unset the remote context.

Ok, consistency with io and memory is one advantage to doing it that way.
Creating kthreads in cgroups also seems viable so far, and it's unclear whether
either approach is significantly simpler or more maintainable than the other,
at least to me.
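
For reference, the remote charging flow quoted above can be modeled in
userspace roughly as follows.  This is only a sketch under assumptions:
toy_flush() stands in for the update_curr-style flush, the remote pointer
plays the role that current->active_memcg plays for memory, and all of the
names (struct toy_task, struct toy_cpu_cgroup, toy_remote_begin(),
toy_remote_end()) are hypothetical rather than existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cgroup with a single CPU time counter. */
struct toy_cpu_cgroup {
	uint64_t cpu_ns;
};

/* Hypothetical task; remote is NULL unless it works on another group's behalf. */
struct toy_task {
	struct toy_cpu_cgroup *home;
	struct toy_cpu_cgroup *remote;
	uint64_t last_flush_ns;
};

/* Charge the runtime since the last flush to whichever group is active. */
static void toy_flush(struct toy_task *t, uint64_t now_ns)
{
	struct toy_cpu_cgroup *cg = t->remote ? t->remote : t->home;

	cg->cpu_ns += now_ns - t->last_flush_ns;
	t->last_flush_ns = now_ns;
}

/* Begin remote execution: flush to the home group, then switch context. */
static void toy_remote_begin(struct toy_task *t, struct toy_cpu_cgroup *cg,
			     uint64_t now_ns)
{
	assert(t->remote == NULL);	/* no nesting in this sketch */
	toy_flush(t, now_ns);
	t->remote = cg;
}

/* End remote execution: flush the delta to the remote group, then unset. */
static void toy_remote_end(struct toy_task *t, uint64_t now_ns)
{
	assert(t->remote != NULL);
	toy_flush(t, now_ns);
	t->remote = NULL;
}
```

A worker that runs from 0 to 400ns and does remote work between 100ns and
350ns ends up with 250ns charged to the remote group and the remaining 150ns
charged to its own.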

Is someone on your side working on remote charging right now? I was planning
to post an RFD comparing these soon and it would make sense to include them.