Re: [PATCH v2 5/7] memcg: pr_warn_once for unexpected events and stats

From: Johannes Weiner
Date: Sat Apr 27 2024 - 10:22:49 EST


On Fri, Apr 26, 2024 at 06:18:13PM -0700, Shakeel Butt wrote:
> On Fri, Apr 26, 2024 at 05:58:16PM -0700, Yosry Ahmed wrote:
> > On Fri, Apr 26, 2024 at 5:38 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> > >
> > > > To reduce memory usage by the memcg events and stats, the kernel uses
> > > > an indirection table and only allocates the stats and events that are
> > > > used by the memcg code. To make this more robust, let's add warnings
> > > > where unexpected stat and event indexes are used.
> > >
> > > Signed-off-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> > > ---
> > > mm/memcontrol.c | 43 ++++++++++++++++++++++++++++++++++---------
> > > 1 file changed, 34 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 103e0e53e20a..36145089dcf5 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -671,9 +671,11 @@ unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
> > > >  		return node_page_state(lruvec_pgdat(lruvec), idx);
> > > >
> > > >  	i = memcg_stats_index(idx);
> > > > -	if (i >= 0) {
> > > > +	if (likely(i >= 0)) {
> > > >  		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> > > >  		x = READ_ONCE(pn->lruvec_stats->state[i]);
> > > > +	} else {
> > > > +		pr_warn_once("%s: stat item index: %d\n", __func__, idx);
> > > >  	}
> >
> > Can we make these more compact by using WARN_ON_ONCE() instead:
> >
> > > 	if (WARN_ON_ONCE(i < 0))
> > > 		return 0;
> >
> > I guess the advantage of using pr_warn_once() is that we get to print
> > the exact stat index, but the stack trace from WARN_ON_ONCE() should
> > make it obvious in most cases AFAICT.

	if (WARN_ONCE(i < 0, "stat item %d not in memcg_node_stat_items\n", idx))
		return 0;

should work?

> > No strong opinions either way.
>
> One reason I used pr_warn_once() over WARN_ON_ONCE() is the syzbot
> trigger. No need to trip the bot over this error condition.

The warn splat is definitely quite verbose. But I think that would
only be annoying initially, in case a site was missed. Down the line,
it seems helpful to have this stand out to somebody who is trying to
add a new cgroup stat and forgets to update the right enums.
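
To make the failure mode concrete, here is a rough userspace sketch of
the indirection table plus a once-only warning. It is not the
mm/memcontrol.c code: the enum, the table names, warn_once_missing()
and warnings_printed are all hypothetical stand-ins, and the helper only
imitates WARN_ONCE() by printing one line (no stack trace, and one flag
shared by all callers rather than one per call site).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the full set of stat items. */
enum stat_item { STAT_A, STAT_B, STAT_C, STAT_D, NR_STAT_ITEMS };

/* Only this subset gets per-cgroup storage (assumed for illustration). */
static const enum stat_item tracked_items[] = { STAT_A, STAT_C };
#define NR_TRACKED (sizeof(tracked_items) / sizeof(tracked_items[0]))

/* Indirection table: holds compact index + 1, so 0 means "not tracked". */
static signed char stats_index[NR_STAT_ITEMS];
static long state[NR_TRACKED];
static int warnings_printed;

static void init_stats_index(void)
{
	for (size_t i = 0; i < NR_TRACKED; i++) {
		assert(tracked_items[i] < NR_STAT_ITEMS);
		stats_index[tracked_items[i]] = (signed char)(i + 1);
	}
}

/* Returns the compact index, or -1 if the item has no storage. */
static int stats_index_of(enum stat_item idx)
{
	return stats_index[idx] - 1;
}

/* Imitation of WARN_ONCE(): warns on the first hit, returns the condition. */
static bool warn_once_missing(bool cond, const char *func, int idx)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_printed++;
		fprintf(stderr, "%s: stat item %d not tracked\n", func, idx);
	}
	return cond;
}

static long read_stat(enum stat_item idx)
{
	int i = stats_index_of(idx);

	/* Report the caller's item (idx), not the failed lookup result (i). */
	if (warn_once_missing(i < 0, __func__, idx))
		return 0;
	return state[i];
}
```

Somebody who adds STAT_B to the enum but forgets tracked_items[] gets a
0 back from read_stat(STAT_B) plus a single warning naming the item,
which is the situation the warning is meant to make stand out.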