Re: [PATCH 6.15] mm/vma: add give_up_on_oom option on modify/merge, use in uffd release

From: Pedro Falcato
Date: Mon Mar 31 2025 - 11:36:42 EST


On Mon, Mar 31, 2025 at 04:10:41PM +0100, Lorenzo Stoakes wrote:
> I know that none of us love this, but seemed to be consensus that this was
> a viable, if semi-vom-inducing solution - can we go ahead with this?

/me barfs

> Would appreciate acks (even if queasy) if so, so this doesn't get
> stalled. We can always revisit this (in fact, it's on my list...).
>
> On Fri, Mar 21, 2025 at 10:09:37AM +0000, Lorenzo Stoakes wrote:
> > Currently, if a VMA merge fails due to an OOM condition arising on commit
> > merge or a failure to duplicate anon_vma's, we report this so the caller
> > can handle it.
> >
> > However there are cases where the caller is only ostensibly trying a
> > merge, and doesn't mind if it fails due to this condition.
> >
> > Since we do not want to introduce an implicit assumption that we only
> > actually modify VMAs after OOM conditions might arise, add a 'give up on
> > oom' option and make an explicit contract that, should this flag be set, we
> > absolutely will not modify any VMAs should OOM arise and just bail out.
> >
> > Since it'd be very unusual for a user to try to vma_modify() with this flag
> > set but be specifying a range within a VMA which ends up being split (which
> > can fail due to rlimit issues, not only OOM), we add a debug warning for
> > this condition.
> >
> > The motivating reason for this is uffd release - syzkaller (and Pedro
> > Falcato's VERY astute analysis) found a way in which an injected fault on
> > allocation, triggering an OOM condition on commit merge, would result in
> > uffd code becoming confused and treating an error value as if it were a VMA
> > pointer.
> >
> > To avoid this, we make use of this new VMG flag to ensure that this never
> > occurs, utilising the fact that, should we be clearing entire VMAs, we do
> > not wish an OOM event to be reported to us.
> >
> > Many thanks to Pedro Falcato for his excellent analysis and Jann Horn for
> > his insightful and intelligent analysis of the situation, both of whom were
> > instrumental in this fix.
> >
> > Reported-by: syzbot+20ed41006cf9d842c2b5@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://lore.kernel.org/all/67dc67f0.050a0220.25ae54.001e.GAE@xxxxxxxxxx/
> > Fixes: 47b16d0462a4 ("mm: abort vma_modify() on merge out of memory failure")
> > Suggested-by: Pedro Falcato <pfalcato@xxxxxxx>
> > Suggested-by: Jann Horn <jannh@xxxxxxxxxx>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>

Alright, I'm not a huge fan of the solution, but if you feel like it's the best course of action,
I'll trust your instincts. The patch itself LGTM.

Reviewed-by: Pedro Falcato <pfalcato@xxxxxxx>
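
For anyone reading along, here is a minimal userspace sketch of the failure
mode described in the commit message: a caller that does not check for an
error pointer ends up treating an error value as a VMA. The ERR_PTR(),
PTR_ERR() and IS_ERR() helpers are re-created here to mirror the kernel's;
struct vma_model and modify_model() are hypothetical stand-ins and not the
real mm/vma.c code. The give_up_on_oom branch only models the contract
stated above (bail out, touch nothing, report nothing).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical, heavily simplified VMA. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Hypothetical model of a modify/merge attempt. When the (simulated)
 * allocation fails, the default behaviour reports -ENOMEM as an error
 * pointer. With give_up_on_oom set, the VMA is handed back untouched
 * and no error is reported.
 */
static struct vma_model *modify_model(struct vma_model *vma,
				      bool alloc_fails, bool give_up_on_oom)
{
	if (alloc_fails) {
		if (give_up_on_oom)
			return vma;	/* bail out, nothing modified */
		return ERR_PTR(-ENOMEM);
	}
	return vma;	/* success path; the model does no real merging */
}

static void err_ptr_self_check(void)
{
	struct vma_model vma = { 0x1000, 0x2000 };
	struct vma_model *ret;

	/* Without the flag the caller must check IS_ERR() before use. */
	ret = modify_model(&vma, true, false);
	assert(IS_ERR(ret) && PTR_ERR(ret) == -ENOMEM);

	/* With the flag the caller always gets a usable pointer back. */
	ret = modify_model(&vma, true, true);
	assert(!IS_ERR(ret) && ret == &vma);
}
```

A caller that dereferences the first return value without the IS_ERR()
check is the confusion syzkaller hit; the flag removes that case for
callers that do not care whether the merge happened.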

> >  	if (vma->vm_start < start) {
> >  		int err = split_vma(vmg->vmi, vma, start, 1);
> > @@ -1602,12 +1642,15 @@ struct vm_area_struct
> >  		       struct vm_area_struct *vma,
> >  		       unsigned long start, unsigned long end,
> >  		       unsigned long new_flags,
> > -		       struct vm_userfaultfd_ctx new_ctx)
> > +		       struct vm_userfaultfd_ctx new_ctx,
> > +		       bool give_up_on_oom)
> >  {
> >  	VMG_VMA_STATE(vmg, vmi, prev, vma, start, end);
> >
> >  	vmg.flags = new_flags;
> >  	vmg.uffd_ctx = new_ctx;
> > +	if (give_up_on_oom)
> > +		vmg.give_up_on_oom = true;

Why not just
	vmg.give_up_on_oom = give_up_on_oom;
with no if?
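
To spell out why I think the two forms are equivalent, here is a small
userspace sketch. It assumes the field starts out false after the struct is
initialised (I believe VMG_VMA_STATE() leaves unnamed fields zeroed, but
treat that as an assumption). struct vmg_model and both init helpers are
hypothetical and only exist for this comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in, not the kernel's struct vma_merge_struct. */
struct vmg_model {
	unsigned long flags;
	bool give_up_on_oom;
};

/* Form used in the patch: conditionally set the flag. */
static struct vmg_model init_with_if(unsigned long new_flags,
				     bool give_up_on_oom)
{
	struct vmg_model vmg = { 0 };	/* assumption: flag starts false */

	vmg.flags = new_flags;
	if (give_up_on_oom)
		vmg.give_up_on_oom = true;
	return vmg;
}

/* Suggested form: plain assignment. */
static struct vmg_model init_with_assign(unsigned long new_flags,
					 bool give_up_on_oom)
{
	struct vmg_model vmg = { 0 };

	vmg.flags = new_flags;
	vmg.give_up_on_oom = give_up_on_oom;
	return vmg;
}

/* Both forms should agree for either value of the parameter. */
static void assign_self_check(void)
{
	assert(init_with_if(0, false).give_up_on_oom ==
	       init_with_assign(0, false).give_up_on_oom);
	assert(init_with_if(0, true).give_up_on_oom ==
	       init_with_assign(0, true).give_up_on_oom);
}
```

The forms would only differ if the field could already be true before this
point, in which case the if version would preserve it and the assignment
would overwrite it.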

--
Pedro