Re: [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
From: Denis M. Karpov
Date: Thu Apr 09 2026 - 05:14:57 EST
(to Harry)
> Technically, it's less restrictive only if start < mmap_min_addr
> (setting aside the discussion of whether this is an appropriate check).
>
> Otherwise (start >= mmap_min_addr) it's more restrictive? (now, the process
> should have the capability when registering an existing VMA to userfaultfd)
Hmm, I can't find any checks for addr >= mmap_min_addr in the security
subsystem, only for addr < mmap_min_addr. Otherwise, one would need
capabilities for regular mmap() calls as well.
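To illustrate what I mean, here is a rough user-space model of the check as
I read it. This is a sketch, not the kernel code: model_cap_mmap_addr(),
model_min_addr and has_rawio are made-up names standing in for
cap_mmap_addr(), dac_mmap_min_addr and the CAP_SYS_RAWIO test.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model only. model_min_addr stands in for dac_mmap_min_addr
 * and has_rawio for the CAP_SYS_RAWIO test; both names are hypothetical.
 */
static int model_cap_mmap_addr(unsigned long addr,
			       unsigned long model_min_addr,
			       bool has_rawio)
{
	/* At or above the minimum, no capability is consulted at all. */
	if (addr >= model_min_addr)
		return 0;

	/* Below the minimum, only a privileged caller is let through. */
	return has_rawio ? 0 : -EPERM;
}
```

In this model the capability only matters on the low side of the limit, so
an unprivileged caller with start >= mmap_min_addr sees no new restriction.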
> Was it a private discussion? I can't find Andrea's emails on the thread.
Oh, it seems Andrea accidentally dropped some recipients from the CC list.
I have CC'd him here so he can clarify his points if he feels it is necessary.
(to Lorenzo)
> Duplicating this kind of logic in the already horribly duplicative (and more
> generally, horrible) UFFD implementation is actively buggy and incorrect IMO.
So, no security_mmap_addr check, no FIRST_USER_ADDRESS check.
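Roughly, the range validation would then reduce to something like the
user-space sketch below. It is modeled only on the checks visible in the
quoted diff and is not the actual patch: model_validate_range() is a
hypothetical name, and task_size is passed in instead of being read from
the mm.

```c
#include <errno.h>
#include <stdint.h>

/*
 * User-space model only: the function name is hypothetical and
 * task_size is a plain parameter rather than a field of mm_struct.
 */
static int model_validate_range(uint64_t start, uint64_t len,
				uint64_t task_size)
{
	if (!len)
		return -EINVAL;
	/* No mmap_min_addr or security_mmap_addr() check at this point. */
	if (start >= task_size)
		return -EINVAL;
	if (len > task_size - start)
		return -EINVAL;
	/* Reject a range whose end wraps around. */
	if (start + len <= start)
		return -EINVAL;
	return 0;
}
```
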
Thank you both for the review. I'll prepare the patch.
On Thu, Apr 9, 2026 at 11:01 AM Lorenzo Stoakes <ljs@xxxxxxxxxx> wrote:
>
> On Wed, Apr 08, 2026 at 05:36:59AM -0700, Usama Arif wrote:
> > On Tue, 7 Apr 2026 11:14:42 +0300 "Denis M. Karpov" <komlomal@xxxxxxxxx> wrote:
> >
> > > The current implementation of validate_range() in fs/userfaultfd.c
> > > performs a hard check against mmap_min_addr without considering
> > > capabilities, but the mmap() syscall uses security_mmap_addr()
> > > which allows privileged processes (with CAP_SYS_RAWIO) to map below
> > > mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
> > > dac_mmap_min_addr variable which can be changed with
> > > /proc/sys/vm/mmap_min_addr.
> > >
> > > Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
> > > with -EINVAL for valid memory areas that were successfully mapped
> > > below mmap_min_addr even with appropriate capabilities.
> > >
> > > This prevents apps like binary compilers from using UFFD for valid memory
> > > regions mapped by the application.
> > >
> > > Replace the rigid mmap_min_addr check with security_mmap_addr() to align
> > > userfaultfd with the standard kernel memory mapping security policy.
> > >
> > > Signed-off-by: Denis M. Karpov <komlomal@xxxxxxxxx>
> > >
> > > ---
> > > Initial RFC following the discussion on the [BUG] thread.
> > > Link: https://lore.kernel.org/all/CADtiZd0tWysx5HMCUnOXfSHB7PXAuXg1Mh4eY_hUmH29S=sejg@xxxxxxxxxxxxxx/
> > > ---
> > > fs/userfaultfd.c | 4 +---
> > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > > index bdc84e521..dbfe5b2a0 100644
> > > --- a/fs/userfaultfd.c
> > > +++ b/fs/userfaultfd.c
> > > @@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
> > >  		return -EINVAL;
> > >  	if (!len)
> > >  		return -EINVAL;
> > > -	if (start < mmap_min_addr)
> > > -		return -EINVAL;
> > >  	if (start >= task_size)
> > >  		return -EINVAL;
> > >  	if (len > task_size - start)
> > >  		return -EINVAL;
> > >  	if (start + len <= start)
> > >  		return -EINVAL;
> > > -	return 0;
> > > +	return security_mmap_addr(start);
> >
> > Is this introducing an ABI change?
> >
> > The old code returned -EINVAL when start was below mmap_min_addr.
> > The new code calls security_mmap_addr() which returns -EPERM when
> > the caller lacks CAP_SYS_RAWIO. Existing userspace callers checking
> > specifically for -EINVAL would see different behavior when start is
> > below mmap_min_addr.
>
> You mean API change? :) we don't guarantee ABI for kernel stuff anyway.
>
> Firstly, as with Harry, I don't believe we should be duplicating checks here
> anyway. UFFD is duplicative enough as it is.
>
> And this is such a silly edge case that I don't think it is valid or reasonable
> for us to account for whichever totally insane user relies on a pointless
> re-check being done there and _then_ relies on the error code
> being... -EINVAL... which is overloaded for a million other possible failures.
>
> Let's let it be -EFAULT and remove this silly check altogether.
>
> >
> > > }
> > >
> > > static __always_inline int validate_range(struct mm_struct *mm,
> > > --
> > > 2.47.3
> > >
> > >
>
> Thanks, Lorenzo