Re: [PATCH v1 3/3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag

From: Minchan Kim

Date: Wed Apr 29 2026 - 17:17:50 EST


On Wed, Apr 29, 2026 at 01:01:21PM -0700, Suren Baghdasaryan wrote:
> On Wed, Apr 29, 2026 at 1:25 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Tue 28-04-26 15:37:57, Minchan Kim wrote:
> > [...]
> > > From be4bd22a100ed6be2d1d2599ddb9da04043143eb Mon Sep 17 00:00:00 2001
> > > From: Minchan Kim <minchan@xxxxxxxxxx>
> > > Date: Fri, 24 Apr 2026 14:27:08 -0700
> > > Subject: [PATCH] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL
> > > flag
> > >
> > > Currently, process_mrelease() requires userspace to send a SIGKILL signal
> > > prior to invocation. This separation introduces a scheduling race window
> > > where the victim task may receive the signal and enter the exit path
> > > before the reaper can invoke process_mrelease().
> > >
> > > When the victim enters the exit path (do_exit -> exit_mm), it clears its
> > > task->mm immediately. This causes process_mrelease() to fail with -ESRCH,
> > > leaving the actual address space teardown (exit_mmap) to be deferred until
> > > the mm's reference count drops to zero. In the field (e.g., Android),
> > > arbitrary reference counts (reading /proc/<pid>/cmdline, or various other
> > > remote VM accesses) frequently delay this teardown indefinitely,
> > > defeating the purpose of expedited reclamation.
> > >
> > > In Android's LMKD scenarios, this delay keeps memory pressure high, forcing
> > > the system to unnecessarily kill additional innocent background apps before
> > > the memory from the first victim is recovered.
> > >
> > > This patch introduces the PROCESS_MRELEASE_REAP_KILL UAPI flag to support
> > > an integrated auto-kill mode. When specified, process_mrelease() directly
> > > injects a SIGKILL into the target task after finding its mm.
> > >
> > > To solve the race condition, we grab the mm reference via mmgrab() before
> > > sending the SIGKILL. If the user passed PROCESS_MRELEASE_REAP_KILL, we assume
> > > it will free its memory and proceed with reaping, making the logic as simple
> > > as reap = reap_kill || task_will_free_mem(p).
> > >
> > > To handle shared address spaces safely in the auto-kill mode, we bail out
> > > immediately if the mm is marked with MMF_MULTIPROCESS when
> > > PROCESS_MRELEASE_REAP_KILL is specified. This protects existing users of
> > > process_mrelease() from behavior changes while preventing unsafe reaping of
> > > shared memory.
> >
> > Please explain why this is a different behavior from the global oom
> > killer and how do you intend to deal with those mm shared process
> > groups. I am not saying this is a wrong behavior but it will be hard to
> > change once in place.
> >
> > > Fundamentally, this allows process_mrelease() to trigger targeted memory
> > > reclaim (via oom_reaper infrastructure) quickly, even if the victim is
> > > not yet in the exit path, while reusing existing race handling between
> > > reaper and exit_mmap.
> > >
> > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> >
> > Other than the above looks ok to me.
>
> Implementation looks good to me. After addressing Michal's comment
> please split this patch from the series and feel free to add:
>
> Reviewed-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>

I just posted v2 with your Reviewed-by.
https://lore.kernel.org/linux-mm/20260429211359.3829683-1-minchan@xxxxxxxxxx/

Thanks, Suren.