Re: [RFC, PATCH 00/12] userfaultfd: working set tracking for VM guest memory

From: Kiryl Shutsemau

Date: Thu Apr 16 2026 - 16:25:30 EST


On Thu, Apr 16, 2026 at 08:32:19PM +0200, David Hildenbrand (Arm) wrote:
> On 4/16/26 15:49, Kiryl Shutsemau wrote:
> > On Tue, Apr 14, 2026 at 06:10:44PM +0100, Kiryl Shutsemau wrote:
> >> On Tue, Apr 14, 2026 at 05:37:50PM +0200, David Hildenbrand (Arm) wrote:
> >>>
> >>> I would rather tackle this from the other direction: it's another form
> >>> of protection (like WP), not really a "minor" mode.
> >>>
> >>> Could we add a UFFDIO_REGISTER_MODE_RWP (or however we would call it)
> >>> and support it for anon+shmem, avoiding the zapping for shmem completely?
> >>
> >> I like this idea.
> >>
> >> It should be functionally equivalent, but your interface idea fits
> >> better with the rest.
> >>
> >> Thanks! Will give it a try.
> >
> > Here is an updated version:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git/log/?h=uffd/rfc-v2
> >
> > will post after -rc1 is tagged.
> >
> > I like it more. It got substantially cleaner.
>
> I don't have time to look into the details just yet, but my thinking was
> that
>
> a) It would avoid the zap+refault

Yep.

> b) We could reuse the uffd-wp PTE bit + marker to indicate/remember the
> protection, making it co-exist with NUMA hinting naturally.
>
> b) obviously means that we cannot use uffd-wp and uffd-rwp at the same
> time in the same uffd area. I guess that should be acceptable for the
> use cases you have in mind?

I took a different path: I still use PROT_NONE PTEs, so it cannot fully
co-exist with NUMA balancing, but WP + RWP should be fine. I need to add
a test for this.

I didn't give up on NUMA balancing completely: task_numa_fault() is
still called on an RWP fault, so it should help scheduler decisions
somewhat.

I think an RWP user might want to use WP too.
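
To make the trade-off concrete, here is a rough userspace model of the
fault classification described above. It is only a sketch and not the
code from the branch: struct toy_pte, struct toy_vma, classify_fault()
and the enum values are all hypothetical names made up for illustration.
The assumptions are that PROT_NONE in an RWP-registered range is always
reported as an RWP fault, that NUMA accounting still happens on that
path, and that WP is tracked through the separate uffd-wp bit.

```c
/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum fault_kind { FAULT_NONE, FAULT_UFFD_RWP, FAULT_UFFD_WP, FAULT_NUMA_HINT };

struct toy_pte {
	bool prot_none;	/* PTE carries PAGE_NONE protection */
	bool uffd_wp;	/* uffd-wp bit is set on the PTE */
};

struct toy_vma {
	bool uffd_rwp;	/* range registered with the proposed RWP mode */
	bool uffd_wp;	/* range registered with the existing WP mode */
};

struct toy_result {
	enum fault_kind kind;
	bool numa_accounted;	/* stands in for a task_numa_fault() call */
};

static struct toy_result classify_fault(const struct toy_vma *vma,
					const struct toy_pte *pte,
					bool is_write)
{
	struct toy_result res = { FAULT_NONE, false };

	assert(vma && pte);

	if (pte->prot_none) {
		/*
		 * Assumption: in an RWP range PROT_NONE belongs to RWP, so
		 * NUMA hinting cannot claim the PTE for itself. Accounting
		 * is still done so the scheduler gets some signal.
		 */
		res.kind = vma->uffd_rwp ? FAULT_UFFD_RWP : FAULT_NUMA_HINT;
		res.numa_accounted = true;
		return res;
	}

	/* WP relies on its own bit, so it stays independent of RWP. */
	if (is_write && vma->uffd_wp && pte->uffd_wp)
		res.kind = FAULT_UFFD_WP;

	return res;
}
```

In this model a range registered for both modes reports a PROT_NONE PTE
as an RWP fault, while a write to a present PTE with the uffd-wp bit set
is reported as a WP fault. That is the sense in which WP + RWP can share
a range while RWP and NUMA hinting cannot both own PROT_NONE.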

Do you see this trade-off as reasonable?

> But I also haven't taken a closer look at this patch set, whether you
> would already be using a PTE bit somehow (I suspect not :) )

No. I didn't want to allocate a new bit or invent some arch-specific
trick for this. This functionality is available wherever PAGE_NONE
exists.

--
Kiryl Shutsemau / Kirill A. Shutemov