Re: (sashiko review) [PATCH v1 1/1] mm/damon: support MADV_COLLAPSE via DAMOS_COLLAPSE scheme action

From: SeongJae Park

Date: Mon Mar 30 2026 - 20:02:24 EST


+Cc madvise developers (maintainers and reviewers).

On Mon, 30 Mar 2026 16:43:37 -0700 SeongJae Park <sj@xxxxxxxxxx> wrote:

> Forwarding Sashiko review for doing discussions via mails.
>
> # review url: https://sashiko.dev/#/patchset/20260330145758.2115502-1-gutierrez.asier@xxxxxxxxxxxxxxxxxxx
> # start of sashiko.dev inline review
> commit cd78854e55ee90102333f0331c58417c16e8d798
> Author: Asier Gutierrez <gutierrez.asier@xxxxxxxxxxxxxxxxxxx>
>
> mm/damon: support MADV_COLLAPSE via DAMOS_COLLAPSE scheme action
>
> This commit introduces the DAMOS_COLLAPSE action to DAMON. It allows DAMON
> to synchronously collapse memory regions into huge pages by leveraging the
> existing MADV_COLLAPSE mechanism.
>
> Link: https://lore.kernel.org/damon/20260313000816.79933-1-sj@xxxxxxxxxx/
>
> [ ... ]
>
> > diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> > index b069dbc7e3d2..dd5f2d7027ac 100644
> > --- a/mm/damon/vaddr.c
> > +++ b/mm/damon/vaddr.c
> > @@ -903,6 +903,9 @@ static unsigned long damon_va_apply_scheme(struct damon_ctx *ctx,
> > case DAMOS_NOHUGEPAGE:
> > madv_action = MADV_NOHUGEPAGE;
> > break;
> > + case DAMOS_COLLAPSE:
> > + madv_action = MADV_COLLAPSE;
>
> Could exposing MADV_COLLAPSE to automated continuous execution via DAMOS
> trigger a pre-existing use-after-free in the core madvise logic?
>
> In madvise_collapse(), the lock_dropped boolean tracks whether the lock is
> currently dropped and tells madvise_walk_vmas() if the lock was ever dropped.
>
> If collapse_single_pmd() drops the mmap lock, *lock_dropped is set to true.
> On the next loop iteration, madvise_collapse() re-acquires the lock,
> validates the VMA, and resets *lock_dropped = false to avoid double-locking.
> If a subsequent iteration does not drop the lock, the function returns to
> madvise_walk_vmas() with lock_dropped set to false.
>
> Because madvise_walk_vmas() sees false, it assumes the lock was continuously
> held and the original VMA pointer is still safe. However, another thread
> could have freed the original VMA while the lock was temporarily dropped,
> leading to a use-after-free when madvise_walk_vmas() dereferences
> vma->vm_end.
>
> While this flaw pre-dates this patch,

It sounds like sashiko is saying the bug still exists and is not yet fixed.
madvise() developers, could you please confirm whether that is true?
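
To make the question concrete, below is a minimal user-space sketch of the
flag handling as sashiko describes it.  It is not the kernel code and I have
not checked it against madvise_collapse(); model_collapse(), struct
walk_result and the step_drops[] array are hypothetical names made up for
this illustration.  Each array entry only says whether that iteration would
drop the mmap lock.

```c
#include <stdbool.h>

/* Hypothetical model of the review's claim; not the kernel implementation. */
struct walk_result {
	bool flag_at_return;	/* value the caller would see in lock_dropped */
	bool ever_dropped;	/* whether the lock was dropped at any point */
};

/*
 * step_drops[i] is true if iteration i drops the lock (assumption standing
 * in for collapse_single_pmd() dropping the mmap lock).
 */
static struct walk_result model_collapse(const bool *step_drops, int nr_steps)
{
	bool lock_dropped = false;
	bool ever_dropped = false;
	int i;

	for (i = 0; i < nr_steps; i++) {
		if (lock_dropped) {
			/* Re-acquire and revalidate, then reset the flag. */
			lock_dropped = false;
		}
		if (step_drops[i]) {
			lock_dropped = true;
			ever_dropped = true;
		}
	}
	return (struct walk_result){
		.flag_at_return = lock_dropped,
		.ever_dropped = ever_dropped,
	};
}
```

In this model, a run that drops the lock on the first iteration but not on
the second returns with the flag cleared even though the lock was dropped
in between.  That mismatch is the case sashiko is worried about, if the real
code indeed behaves this way.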

> does introducing DAMOS_COLLAPSE expose
> it to continuous background execution by kdamond, making the race condition
> much more likely to occur in practice?

DAMOS_COLLAPSE simply calls do_madvise(), so I think it shouldn't be different
from user space programs calling madvise(MADV_COLLAPSE).  Hence I don't think
this patch is making anything worse.

Nonetheless (assuming sashiko's argument, as I understand it, is real and not
a hallucination), I agree this change could make the bug trigger more
frequently if DAMOS_COLLAPSE is adopted by more people before the bug fix is
merged.  If that's the case and it concerns madvise() developers, I think we
can hold this patch or adjust the schedule to ensure this patch is merged
only after the MADV_COLLAPSE bug fix.  madvise() developers, please let us
know if you think such a hold or schedule adjustment is needed.


Thanks,
SJ

[...]