Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
From: Yafang Shao
Date: Mon Mar 02 2026 - 09:39:35 EST
On Mon, Mar 2, 2026 at 5:48 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Mon, Mar 2, 2026 at 5:20 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> >
> > On Mon, Mar 2, 2026 at 4:25 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > >
> > > The challenge we're currently facing is that we don't yet know which
> > > workloads would benefit from it ;)
> > > We do want to enable mglru on our production servers, but first we
> > > need to address the risk of OOM during the switch—that's exactly why
> > > we're proposing this patch.
> >
> > Nobody objects to your intention to fix it. I’m curious: to what
> > extent do we want to fix it? Do we aim to merely reduce the probability
> > of OOM and other mistakes, or do we want a complete fix that makes
> > the dynamic on/off fully safe?
>
> Yeah, I'm glad that more people are trying MGLRU and improving it.
>
> We also have a downstream fix for the OOM-on-switch issue, but that's
> mostly a fallback in case MGLRU doesn't work well. Our goal is
> still to enable MGLRU as much as possible,
Our goals are aligned.
Before enabling mglru, we must first ensure it won't cause OOM errors
across multiple servers. We propose fixing this because, during our
previous mglru enablement, many instances of a single service OOM'd
simultaneously—potentially leading to data loss for that service.
> many issues have been
> identified and I'm willing to push and fix things upstream together.
>
> I didn't consider the OOM on switch an upstream issue though.
This is a serious upstream kernel bug that could lead to data loss. If
it is not recognized as such, the upstream kernel should consider
removing this dynamic toggle.
> But
> to fix that we just used a schedule_timeout when seeing the lru status
So your proposal is essentially something like this?
    while (status) {
        schedule_timeout(random_timeout);
    }
> is different from the global status, very close to what Barry
> suggested, with some other tweaks.
>
> Continuing to reclaim during the switch did result in some unexpected
> behaviors, including OOM still occurring, just much less likely than
> before. It's a typical TOCTOU problem when checking the lru's status.
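
To make sure we mean the same thing, here is a minimal userspace sketch of
the wait-then-check pattern as I understand it. It is a model only, not
kernel code: struct lru_model, model_sleep() and wait_for_switch() are
hypothetical names I made up, model_sleep() merely stands in for
schedule_timeout(), and the bounded retry count is my own assumption.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct lru_model {
    bool global_enabled;   /* state requested through the toggle */
    bool lruvec_enabled;   /* state this lruvec has actually reached */
    int  ticks_until_sync; /* simulated progress of the state switch */
};

/* Stand-in for schedule_timeout(): each call advances the switch one step. */
static void model_sleep(struct lru_model *m)
{
    if (m->ticks_until_sync > 0 && --m->ticks_until_sync == 0)
        m->lruvec_enabled = m->global_enabled;
}

/*
 * Sleep while the lruvec state differs from the global state, giving up
 * after max_sleeps attempts. Returns true if the two states matched at
 * the time of the last check.
 *
 * Note the TOCTOU window: a "true" result only describes the moment of
 * the check. Without a lock, the toggle may flip again before the caller
 * starts walking the lists.
 */
static bool wait_for_switch(struct lru_model *m, int max_sleeps)
{
    while (m->lruvec_enabled != m->global_enabled) {
        if (max_sleeps-- <= 0)
            return false;
        model_sleep(m);
    }
    return true;
}
```

In this model a caller would reclaim only after wait_for_switch() returns
true, and would still need some way to handle the state changing right
after the check.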
>
> Let me Cc Bingfang, maybe he can provide more detail.
Looking forward to your solution.
--
Regards
Yafang