Re: [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts
From: Roman Gushchin
Date: Thu Aug 27 2020 - 18:37:43 EST
On Thu, Aug 27, 2020 at 02:58:50PM -0700, Shakeel Butt wrote:
> On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@xxxxxx> wrote:
> >
> > Remote memcg charging API uses current->active_memcg to store the
> > currently active memory cgroup, which overwrites the memory cgroup
> > of the current process. It works well for normal contexts, but doesn't
> > work for interrupt contexts: indeed, if an interrupt occurs during
> > the execution of a section with an active memcg set, all allocations
> > inside the interrupt will be charged to the active memcg set (given
> > that we'll enable accounting for allocations from an interrupt
> > context). But because the interrupt might have no relation to the
> > active memcg set outside, it's obviously wrong from the accounting
> > perspective.
> >
> > To resolve this problem, let's add a global percpu int_active_memcg
> > variable, which will be used to store an active memory cgroup which
> > will be sued from interrupt contexts. set_active_memcg() will
>
> *used
>
> > transparently use current->active_memcg or int_active_memcg depending
> > on the context.
> >
> > To make the read part simple and transparent for the caller, let's
> > introduce two new functions:
> > - struct mem_cgroup *active_memcg(void),
> > - struct mem_cgroup *get_active_memcg(void).
> >
> > They return the active memcg if it's set, hiding all implementation
> > details, such as where to get it depending on the current context.
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
>
> I like this patch. Internally we have a similar patch which, instead of
> a per-cpu int_active_memcg, has current->active_memcg_irq. Our use-case
> was radix tree node allocations, where we use the root node's memcg to
> charge all the nodes of the tree. The reason was that we observed a lot
> of zombies which were stuck due to radix tree node charges while the
> actual pages pointed to by those nodes/entries were in use by active
> jobs (shared file system, and the kernel is older than the kmem
> reparenting).
>
> Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Thank you for the reviews, Shakeel!
I'll fix the typo, add your acks, and resend it as v1.
Thanks!
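
For readers following the thread, here is a minimal userspace sketch of the
idea in the commit message: the active memcg lives in one of two slots, and
the accessors pick the slot based on the context. It is not the patch itself.
The struct layouts, the single-CPU `int_active_memcg` variable, the `in_irq`
flag (standing in for an interrupt-context check), the choice to return the
previous value from set_active_memcg(), and the helper
memcg_seen_by_interrupt() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct mem_cgroup { int id; };
struct task { struct mem_cgroup *active_memcg; };

static struct task current_task;            /* models `current` */
static struct mem_cgroup *int_active_memcg; /* models the percpu variable, one CPU only */
static bool in_irq;                         /* models an interrupt-context check */

/*
 * Store memcg in the slot that matches the current context.
 * Returning the previous value is an assumption of this sketch;
 * it lets the caller restore the old state afterwards.
 */
static struct mem_cgroup *set_active_memcg(struct mem_cgroup *memcg)
{
    struct mem_cgroup *old;

    if (in_irq) {
        old = int_active_memcg;
        int_active_memcg = memcg;
    } else {
        old = current_task.active_memcg;
        current_task.active_memcg = memcg;
    }
    return old;
}

/* Read the active memcg for the current context, or NULL if none is set. */
static struct mem_cgroup *active_memcg(void)
{
    return in_irq ? int_active_memcg : current_task.active_memcg;
}

/*
 * Hypothetical helper: model an interrupt arriving while the task has
 * task_memcg set, and return what the interrupt handler would observe.
 */
static struct mem_cgroup *memcg_seen_by_interrupt(struct mem_cgroup *task_memcg)
{
    struct mem_cgroup *old = set_active_memcg(task_memcg);
    struct mem_cgroup *seen;

    in_irq = true;          /* interrupt entry */
    seen = active_memcg();  /* reads the interrupt slot, not the task slot */
    in_irq = false;         /* interrupt exit */

    set_active_memcg(old);  /* restore the task's previous active memcg */
    return seen;
}
```

In this model an interrupt that fires inside a section with an active memcg
set sees NULL unless the handler sets its own memcg, so its allocations are
not charged to the unrelated memcg of the interrupted task.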