Re: [PATCH v1 3/3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag
From: Minchan Kim
Date: Wed Apr 29 2026 - 17:16:15 EST
On Wed, Apr 29, 2026 at 10:25:47AM +0200, Michal Hocko wrote:
> On Tue 28-04-26 15:37:57, Minchan Kim wrote:
> [...]
> > >From be4bd22a100ed6be2d1d2599ddb9da04043143eb Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan@xxxxxxxxxx>
> > Date: Fri, 24 Apr 2026 14:27:08 -0700
> > Subject: [PATCH] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL
> > flag
> >
> > Currently, process_mrelease() requires userspace to send a SIGKILL signal
> > prior to invocation. This separation introduces a scheduling race window
> > where the victim task may receive the signal and enter the exit path
> > before the reaper can invoke process_mrelease().
> >
> > When the victim enters the exit path (do_exit -> exit_mm), it clears its
> > task->mm immediately. This causes process_mrelease() to fail with -ESRCH,
> > leaving the actual address space teardown (exit_mmap) to be deferred until
> > the mm's reference count drops to zero. In the field (e.g., Android),
> > extra mm references held by other tasks (taken while reading
> > /proc/<pid>/cmdline or during other remote VM accesses) frequently delay
> > this teardown indefinitely, defeating the purpose of expedited reclamation.
> >
> > In Android's LMKD scenarios, this delay keeps memory pressure high, forcing
> > the system to unnecessarily kill additional innocent background apps before
> > the memory from the first victim is recovered.
> >
> > This patch introduces the PROCESS_MRELEASE_REAP_KILL UAPI flag to support
> > an integrated auto-kill mode. When specified, process_mrelease() directly
> > injects a SIGKILL into the target task after finding its mm.
> >
> > To close the race, we take a reference on the mm via mmgrab() before
> > sending the SIGKILL. If the user passed PROCESS_MRELEASE_REAP_KILL, we
> > assume the victim will free its memory and proceed with reaping, making
> > the logic as simple as reap = reap_kill || task_will_free_mem(p).
> >
> > To handle shared address spaces safely in the auto-kill mode, we bail out
> > immediately if the mm is marked with MMF_MULTIPROCESS when
> > PROCESS_MRELEASE_REAP_KILL is specified. This protects existing users of
> > process_mrelease() from behavior changes while preventing unsafe reaping of
> > shared memory.
>
> Please explain why this behaves differently from the global oom
> killer and how you intend to deal with process groups that share an
> mm. I am not saying this is wrong behavior, but it will be hard to
> change once in place.
Sure.
>
> > Fundamentally, this allows process_mrelease() to trigger targeted memory
> > reclaim (via oom_reaper infrastructure) quickly, even if the victim is
> > not yet in the exit path, while reusing existing race handling between
> > reaper and exit_mmap.
> >
> > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
>
> Other than the above, this looks ok to me.
Thanks for the suggestion and reviews, Michal.
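
For anyone following the thread, the decision logic described in the quoted
commit message (reap = reap_kill || task_will_free_mem(p), plus the
MMF_MULTIPROCESS bail-out) can be modelled as a small standalone C sketch.
This is not the patch itself. The flag value, the struct, the outcome labels
and every sketch_* name are hypothetical, and the ordering of the checks is
only my reading of the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the thread does not give the real UAPI value. */
#define SKETCH_REAP_KILL (1u << 0)

/* Simplified stand-in for the victim's state (not a kernel structure). */
struct sketch_victim {
	bool has_mm;          /* false once exit_mm() has cleared task->mm */
	bool mm_multiprocess; /* models MMF_MULTIPROCESS on the mm */
	bool will_free_mem;   /* models task_will_free_mem(p) */
};

/* Illustrative labels only, not the errno values the patch returns. */
enum sketch_outcome {
	SKETCH_NO_MM,         /* victim already dropped its mm */
	SKETCH_REFUSE_SHARED, /* auto-kill refused: mm shared across processes */
	SKETCH_NOT_DYING,     /* no flag given and the task is not exiting */
	SKETCH_REAP,          /* proceed to reap the address space */
};

static enum sketch_outcome sketch_decide(const struct sketch_victim *v,
					 unsigned int flags, bool *send_kill)
{
	bool reap_kill = (flags & SKETCH_REAP_KILL) != 0;
	bool reap;

	assert(v != NULL && send_kill != NULL);
	*send_kill = false;

	if (!v->has_mm)
		return SKETCH_NO_MM;

	/* Bail out before signalling anything if the mm is shared. */
	if (reap_kill && v->mm_multiprocess)
		return SKETCH_REFUSE_SHARED;

	/* In the patch, the mm is pinned via mmgrab() before this point. */
	*send_kill = reap_kill;

	reap = reap_kill || v->will_free_mem;
	return reap ? SKETCH_REAP : SKETCH_NOT_DYING;
}
```

With the flag set and a private mm, the model reports that a SIGKILL is sent
and reaping proceeds even though the victim has not started exiting. Without
the flag, it falls back to the existing task_will_free_mem() condition.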