Re: [RFC 0/3] mm: process_mrelease: expedited reclaim and auto-kill support

From: Minchan Kim

Date: Thu Apr 23 2026 - 20:08:49 EST


On Thu, Apr 23, 2026 at 03:36:57PM -0700, Suren Baghdasaryan wrote:
> On Thu, Apr 23, 2026 at 2:50 AM David Hildenbrand (Arm)
> <david@xxxxxxxxxx> wrote:
> >
> > On 4/23/26 09:50, Michal Hocko wrote:
> > > On Mon 20-04-26 14:53:23, Minchan Kim wrote:
> > >> On Fri, Apr 17, 2026 at 09:11:21AM +0200, Michal Hocko wrote:
> > > [...]
> > >>> Yes. All which make sense, really. I am still not convinced about the
> > >>> clean page cache because that just seems like a hack to workaround wrong
> > >>> userspace oom heuristics.
> > >>
> > >> I see it a bit differently. When the platform decides to kill a process
> > >> to free up memory, it wants that memory back right away.
> > >>
> > >> So it doesn't make much sense for the kernel to ignore that and leave the clean
> > >> file pages to be picked up slowly by kswapd later.
> > >>
> > >> In some aspects, you can think of LMKD as a more specialized, userspace version
> > >> of kswapd. It has high-level knowledge of process priorities and knows exactly
> > >> which process is safe to kill to get memory instantly. The kernel's kswapd,
> > >> however, operates globally without this specific process-level awareness, which
> > >> makes it less suited for this kind of targeted reclamation.
> > >>
> > >> If we force LMKD to rely on the slower global kswapd to actually free the clean
> > >> pages, it defeats the whole purpose of targeting a specific process.
> > >>
> > >> So letting process_mrelease speed this up isn't a hack at all. It's just helping
> > >> the kernel do what the admin wanted in the first place: fast, targeted memory.
> > >
> > > This is a very creative/disruptive way to do a memory reclaim. From a
> > > user POV I would much rather see clean page cache reclaimed before my
> > > apps start to disappear. But this is obviously your call and your users
> > > that will care.
> > >
> > > Anyway, I still maintain my position. I do not think it is a good
> > > idea to drop clean page cache as you do not know whether there are other
> > > users.
>
> I'm very much familiar with these issues in Android and really want to
> find a good solution for them. IIUC, this RFC tries to address 2
> things at once:
> 1. handling clean private page cache when reaping memory of a kill victim;
> 2. addressing a race between kill() and process_mrelease():
> process_mrelease() can't happen before the kill(), but if it happens too
> late, after the victim has passed its exit_mm(), then process_mrelease()
> fails to find the mm to reap. This defeats the purpose of the
> process_mrelease() call because the actual memory (released by
> exit_mmap()) might not yet be free and a successful process_mrelease()
> would be very beneficial.
>
> I see these two as separate issues and I'm not sure combining them
> into a single discussion is a good idea.

Yeah, they are two different issues, so I tried to describe both problems
in the cover letter and address each issue in its own patch.

I can easily drop either of them if it's not well received.
I am also fine with sending them separately if combining them is confusing.
No problem.
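
For readers who do not have the userspace side in mind, the kill-then-reap
sequence behind the second issue can be sketched roughly as below. This is
only an illustration, not LMKD's actual code: kill_and_reap() is a
hypothetical helper, and the fallback syscall numbers are an assumption for
build environments whose headers predate these syscalls.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fallback syscall numbers, used only when the installed headers lack them.
 * Assumption: the unified numbering shared by most architectures.
 */
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif
#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_process_mrelease
#define __NR_process_mrelease 448
#endif

/*
 * Hypothetical helper: kill @pid, then ask the kernel to reap its memory.
 * Returns 0 on success or a negative errno value.
 */
static int kill_and_reap(pid_t pid)
{
	int ret = 0;
	int pidfd = (int)syscall(__NR_pidfd_open, pid, 0);

	if (pidfd < 0)
		return -errno;

	/* process_mrelease() requires a dying target, so the kill comes first. */
	if (syscall(__NR_pidfd_send_signal, pidfd, SIGKILL, NULL, 0) < 0) {
		ret = -errno;
		goto out;
	}

	/*
	 * Race window: if the victim has already passed exit_mm(), this
	 * fails with ESRCH even though exit_mmap() may still be freeing
	 * the memory.
	 */
	if (syscall(__NR_process_mrelease, pidfd, 0) < 0)
		ret = -errno;
out:
	close(pidfd);
	return ret;
}
```

A caller that sees -ESRCH here cannot tell whether the memory has already
been freed or is still being torn down, which is the gap described above.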

>
> >
> > IIRC, Johannes raised in the past that we cannot predict the future.
> >
> > For example, if an app gets OOM-killed, wouldn't we usually try restarting it,
> > re-consuming the clean pagecache pages we would be evicting here?
>
> Sure, we can't predict which app the user will use next, so when
> killing we usually kill the least recently used one. That's a
> reasonable strategy in most cases.
> In general, if speeding up the victim's reclaim negatively affects the
> overall user workflow then this would mean we are selecting wrong kill
> targets. In that case, we would need to adjust the target selection
> strategy.
>
> Thanks for tackling this Minchan! I'll try to review the patches this
> weekend and provide my feedback.

Please go with the second patchset:
https://lore.kernel.org/linux-mm/20260421230239.172582-1-minchan@xxxxxxxxxx/

Thanks!