Re: [RFC, PATCH 00/12] userfaultfd: working set tracking for VM guest memory
From: Kiryl Shutsemau
Date: Tue Apr 14 2026 - 13:14:11 EST

On Tue, Apr 14, 2026 at 05:37:50PM +0200, David Hildenbrand (Arm) wrote:
> On 4/14/26 16:23, Kiryl Shutsemau (Meta) wrote:
> > This series adds userfaultfd support for tracking the working set of
> > VM guest memory, enabling VMMs to identify cold pages and evict them
> > to tiered or remote storage.
> >
> > == Problem ==
> >
> > VMMs managing guest memory need to:
> > 1. Track which pages are actively used (working set detection)
> > 2. Safely evict cold pages to slower storage
> > 3. Fetch pages back on demand when accessed again
> >
> > For shmem-backed guest memory, working set tracking partially works
> > today: MADV_DONTNEED zaps PTEs while pages stay in page cache, and
> > re-access auto-resolves from cache. But safe eviction still requires
> > synchronous fault interception to prevent data loss races.
> >
> > For anonymous guest memory (needed for KSM cross-VM deduplication),
> > there is no mechanism at all — clearing a PTE loses the page.
> >
> > == Solution ==
> >
> > The series introduces a unified userfaultfd interface that works
> > across both anonymous and shmem-backed memory:
> >
> > UFFD_FEATURE_MINOR_ANON: extends MODE_MINOR registration to anonymous
> > private memory. Uses the PROT_NONE hinting mechanism (same as NUMA
> > balancing) to make pages inaccessible without freeing them.
>
> I would rather tackle this from the other direction: it's another form
> of protection (like WP), not really a "minor" mode.
>
> Could we add a UFFDIO_REGISTER_MODE_RWP (or however we would call it)
> and support it for anon+shmem, avoiding the zapping for shmem completely?

I like this idea.

It should be functionally equivalent, but your interface fits better
with the rest of the userfaultfd API.

Thanks! Will give it a try.
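
For reference, the shmem vs. anonymous asymmetry described in the cover
letter is easy to observe from userspace. Below is a minimal sketch,
assuming Linux and Python 3.8+ (for mmap.madvise). The helper name
survives_dontneed is made up for illustration, and this is not code
from the series. It only demonstrates today's MADV_DONTNEED behaviour,
not the proposed userfaultfd mode.

```python
import mmap

PAGE = mmap.PAGESIZE


def survives_dontneed(flags):
    """Write a marker, zap the PTEs with MADV_DONTNEED, then re-read.

    Returns True if the marker is still readable afterwards.
    (Hypothetical helper, for illustration only.)
    """
    m = mmap.mmap(-1, PAGE, flags=flags)
    try:
        m[:4] = b"data"                # fault the page in and dirty it
        m.madvise(mmap.MADV_DONTNEED)  # drop the mapping's PTEs
        return m[:4] == b"data"        # re-access triggers a fresh fault
    finally:
        m.close()


# Shared anonymous memory is shmem-backed: the page stays in the page
# cache, so the re-access resolves from cache and the data is intact.
shared_kept = survives_dontneed(mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)

# Private anonymous memory has no backing store: the page is freed and
# the re-access gets a zero-filled page.
private_kept = survives_dontneed(mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

print("shared  (shmem) data kept:", shared_kept)
print("private (anon)  data kept:", private_kept)
```

On a stock Linux kernel the shared case keeps its data and the private
case does not, which is why anonymous memory needs a way to make pages
inaccessible without freeing them.
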
--
Kiryl Shutsemau / Kirill A. Shutemov