Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

From: Michal Hocko
Date: Fri Feb 19 2021 - 04:18:04 EST


On Wed 17-02-21 12:41:36, Tim Chen wrote:
> On a per node basis, the mem cgroup soft limit tree on each node tracks
> how much a cgroup has exceeded its soft limit memory limit and sorts
> the cgroup by its excess usage. On page release, the trees are not
> updated right away, until we have gathered a batch of pages belonging to
> the same cgroup. This reduces the frequency of updating the soft limit tree
> and locking of the tree and associated cgroup.
>
> However, the batch of pages could contain pages from multiple nodes but
> only the soft limit tree from one node would get updated. Change the
> logic so that we update the tree in batch of pages, with each batch of
> pages all in the same mem cgroup and memory node. An update is issued for
> the batch of pages of a node collected till now whenever we encounter
> a page belonging to a different node. Note that this batching for
> the same node logic is only relevant for v1 cgroup that has a memory
> soft limit.

Let me paste the discussion related to this patch from other reply:
> >> For patch 3 regarding the uncharge_batch, it
> >> is more of an observation that we should uncharge in batch of same node
> >> and not prompted by actual workload.
> >> Thinking more about this, the worst that could happen
> >> is we could have some entries in the soft limit tree that overestimate
> >> the memory used. The worst that could happen is a soft page reclaim
> >> on that cgroup. The overhead from extra memcg event update could
> >> be more than a soft page reclaim pass. So let's drop patch 3
> >> for now.
> >
> > I would still prefer to handle that in the soft limit reclaim path and
> > check each memcg for the soft limit reclaim excess before the reclaim.
> >
>
> Something like this?
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 8bddee75f5cb..b50cae3b2a1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3472,6 +3472,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> if (!mz)
> break;
>
> + /*
> + * Soft limit tree is updated based on memcg events sampling.
> + * We could have missed some updates on page uncharge and
> + * the cgroup is below soft limit. Skip useless soft reclaim.
> + */
> + if (!soft_limit_excess(mz->memcg))
> + continue;
> +
> nr_scanned = 0;
> reclaimed = mem_cgroup_soft_reclaim(mz->memcg, pgdat,

Yes I meant something like this but then I have looked more closely and
this shouldn't be needed afterall. __mem_cgroup_largest_soft_limit_node
already does all the work
if (!soft_limit_excess(mz->memcg) ||
!css_tryget(&mz->memcg->css))
goto retry;
so this shouldn't really happen.
--
Michal Hocko
SUSE Labs