Re: [PATCH] mm/percpu, memcontrol: Per-memcg-lruvec percpu accounting

From: Michal Hocko

Date: Mon Mar 30 2026 - 10:28:20 EST


On Mon 30-03-26 07:10:10, Joshua Hahn wrote:
> On Mon, 30 Mar 2026 14:03:29 +0200 Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> > On Fri 27-03-26 12:19:35, Joshua Hahn wrote:
> > > Convert MEMCG_PERCPU_B from a memcg_stat_item to a memcg_node_stat_item
> > > to give visibility into per-node breakdowns for percpu allocations and
> > > turn it into NR_PERCPU_B.
> >
> > Why do we need/want this?
>
> Hello Michal,
>
> Thank you for reviewing my patch! I hope you are doing well.
>
> You're right, I could have done a better job of motivating the patch.
> My intent with this patch is to give some more visibility into where
> memory is physically, once you know which memcg it is in.

Please keep in mind that WHY is very often much more important than HOW
in a patch, so you should always start with the intention and
justification.

> Percpu memory could probably be seen as "trivial" when it comes to figuring
> out what node it is on, but I'm hoping to make similar transitions to the
> rest of enum memcg_stat_item as well (you can see my work for the zswap
> stats in [1]).
>
> When all of the memory is moved from being tracked per-memcg to per-lruvec,
> the final vision is to be able to attribute node placement within
> each memcg. This can help with diagnosing things like asymmetric node
> pressure within a memcg, which is currently only partially accurate.
>
> Getting per-node breakdowns of percpu memory orthogonal to memcgs also
> seems like a win to me. While unlikely, I think that we can benefit from
> some amount of visibility into whether percpu allocations are happening
> equally across all CPUs.
>
> What do you think? Thank you again, I hope you have a great day!

I think that you should have started with this intended outcome first
rather than slicing it into pieces. Why do we want to shift to per-node
stats for other/all counters? What is the associated cost compared to
the existing accounting (if any)? Please go into detail on how you plan
to use the data before we commit to a lot of code churn.

TBH I do not see any fundamental reason why this would be impossible,
but I am not really sure this is worth the work, and I also cannot see
the potential subtle issues that we might stumble over when getting
there. So I would appreciate it if you could look into that more deeply
and provide us with an evaluation of how you want to achieve your end
goal and what we can expect on the way. It is, of course, impossible to
see all potential problems without starting to implement the thing, but
a high level evaluation would be really helpful.

> Joshua
>
> [1] https://lore.kernel.org/all/20260311195153.4013476-1-joshua.hahnjy@xxxxxxxxx/

--
Michal Hocko
SUSE Labs