Re: [Resend PATCH 1/6] mm/memcg: warning on !memcg after readahead page charged

From: Hugh Dickins
Date: Thu Aug 27 2020 - 02:01:28 EST


On Mon, 24 Aug 2020, Hugh Dickins wrote:
> On Tue, 25 Aug 2020, Alex Shi wrote:
> > reproduce using our linux-mm random bug collection on NUMA systems.
> > >>
> > >> OK, I must have missed that this was on ppc. The order makes more sense
> > >> now. I will have a look at this next week.
> > >
> > > OK, so I've had a look and I know what's going on there. The
> > > move_pages12 is migrating hugetlb pages. Those are not charged to any
> > > memcg. We have completely missed this case. There are two ways going
> > > around that. Drop the warning and update the comment so that we do not
> > > forget about that or special case hugetlb pages.
> > >
> > > I think the first option is better.
> > >
> >
> >
> > Hi Michal,
> >
> > Compare to ignore the warning which is designed to give, seems addressing
> > the hugetlb out of charge issue is a better solution, otherwise the memcg
> > memory usage is out of control on hugetlb, is that right?

I agree: it seems that hugetlb is not participating in memcg and lrus,
so it should not even be calling mem_cgroup_migrate(). That happens
because hugetlb finds the rest of migrate_page_states() useful,
but maybe there just needs to be an "if (!PageHuge(page))" or
"if (!PageHuge(newpage))" before its call to mem_cgroup_migrate() -
but I have not yet checked whether either of those actually works.

The same could be done inside mem_cgroup_migrate() instead,
but it just seems wrong for hugetlb to be getting that far,
if it has no other reason to enter mm/memcontrol.c.

>
> Please don't suppose that this is peculiar to hugetlb: I'm not
> testing hugetlb at all (sorry), but I see the VM_WARN_ON_ONCE from
> mem_cgroup_page_lruvec(), and from mem_cgroup_migrate(), and from
> mem_cgroup_swapout().
>
> In all cases seen on a PageAnon page (well, in one case PageKsm).
> And not related to THP either: seen also on machine incapable of THP.
>
> Maybe there's an independent change in 5.9-rc that's defeating
> expectations here, or maybe they were never valid. Worth
> investigating, even though the patch is currently removed,
> to find out why expectations were wrong.

It was very well worth investigating. And at the time of writing
the above, I thought it was coming up very quickly on all machines,
but in fact it only came up quickly on the one exercising KSM;
on the other machines it took about an hour to appear, so no
wonder that you and others had not already seen it.

While I'd prefer to spring the answer on you all in the patch that
fixes it, there's something more there that I don't fully understand
yet, and want to sort out before posting; so I'd better not keep you
in suspense... we broke the memcg charging of ksm_might_need_to_copy()
pages a couple of releases ago, and not noticed until your warning.

What's surprising is that the same bug can affect PageAnon pages too,
even when there's been no KSM involved whatsoever. I put in the KSM
fix, set all the machines running, expecting to get more info on the
PageAnon instances, but all of them turned out to be fixed.

>
> You'll ask me for more info, stacktraces etc, and I'll say sorry,
> no time today. Please try the swapping tests I sent before.
>
> And may I say, the comment
> /* Readahead page is charged too, to see if other page uncharged */
> is nonsensical to me, and much better deleted: maybe it would make
> some sense if the reader could see the comment it replaces - as
> they can in the patch - but not in the resulting source file.

I stand by that remark; but otherwise, I think this was a helpful
commit that helped to identify a bug, just as it was intended to do.
(I say "helped to" because its warnings alerted, but did not point
to the culprit: I had to add another in lru_cache_add() to find it.)

Hugh