Re: [RFC PATCH v2 00/37] guest_memfd: In-place conversion support
From: Lisa Wang
Date: Fri Feb 20 2026 - 04:09:19 EST
On Mon, Feb 02, 2026 at 02:36:37PM -0800, Ackerley Tng wrote:
> (resending to fix Message-ID)
>
> Here's a second revision of guest_memfd In-place conversion support.
>
> In this version, other than addressing comments from RFCv1 [1], the largest
> change is that guest_memfd now does not avoid participation in LRU; it
> participates in LRU by joining the unevictable list (no change from before this
> series).
>
> While checking for elevated refcounts during shared to private conversions,
> guest_memfd will now do an lru_add_drain_all() if elevated refcounts were found,
> before concluding that there are true users of the shared folio and erroring
> out.
>
> I'd still like feedback on these points, if any:
>
> 1. Having private/shared status stored in a maple tree (Thanks Michael for your
> support of using maple trees over xarrays for performance! [5]).
> 2. Having a new guest_memfd ioctl (not a vm ioctl) that performs conversions.
> 3. Using ioctls/structs/input attribute similar to the existing vm ioctl
> KVM_SET_MEMORY_ATTRIBUTES to perform conversions.
> 4. Storing requested attributes directly in the maple tree.
> 5. Using a KVM module-wide param to toggle between setting memory attributes via
> vm and guest_memfd ioctls (making them mutually exclusive - a single loaded
> KVM module can only do one of the two).
>
> [...snip...]
>
>
> --
> 2.53.0.rc1.225.gd81095ad13-goog

I’ve tested memory failure handling after applying this series, and here’s what
memory_failure() does:

Shared memory: In line with other in-memory filesystems, the memory_failure()
handler unmaps the page if it is currently mapped. For a page that is mapped and
dirty, it also issues a SIGBUS

- if memory failure was injected with MF_ACTION_REQUIRED, or
- if the test process’s memory corruption kill policy is PR_MCE_KILL_EARLY

Here’s the above, in table form:

| MF_ACTION_REQUIRED | Kill Policy       | Mapped | Dirty | Result: SIGBUS |
|--------------------|-------------------|--------|-------|----------------|
| false              | PR_MCE_KILL_EARLY | true   | true  | true           |
| false              | PR_MCE_KILL_EARLY | true   | false | false          |
| false              | PR_MCE_KILL_EARLY | false  | true  | false          |
| false              | PR_MCE_KILL_EARLY | false  | false | false          |
| false              | PR_MCE_KILL_LATE  | true   | true  | false          |
| false              | PR_MCE_KILL_LATE  | true   | false | false          |
| false              | PR_MCE_KILL_LATE  | false  | true  | false          |
| false              | PR_MCE_KILL_LATE  | false  | false | false          |
| true               | Any Policy        | true   | true  | true           |
| true               | Any Policy        | true   | false | false          |
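
As a rough sketch, the shared-memory results in the table above can be encoded
as a small predicate plus a self-check. This only summarizes what I observed; it
is not how memory_failure() is implemented. The names kill_policy, mf_case,
shared_expect_sigbus and check_shared_cases are made up for illustration, and
the "Any Policy" rows are expanded into one row per policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these exist in the kernel. */
enum kill_policy { KILL_LATE, KILL_EARLY };

struct mf_case {
	bool action_required;	/* injected with MF_ACTION_REQUIRED */
	enum kill_policy policy;	/* PR_MCE_KILL_LATE or PR_MCE_KILL_EARLY */
	bool mapped;
	bool dirty;
	bool sigbus;		/* observed result from the table */
};

/* Observed rule for shared memory, as read off the table. */
bool shared_expect_sigbus(bool action_required, enum kill_policy policy,
			  bool mapped, bool dirty)
{
	return mapped && dirty &&
	       (action_required || policy == KILL_EARLY);
}

/* Rows of the shared-memory table, in the same order. */
static const struct mf_case shared_cases[] = {
	{ false, KILL_EARLY, true,  true,  true  },
	{ false, KILL_EARLY, true,  false, false },
	{ false, KILL_EARLY, false, true,  false },
	{ false, KILL_EARLY, false, false, false },
	{ false, KILL_LATE,  true,  true,  false },
	{ false, KILL_LATE,  true,  false, false },
	{ false, KILL_LATE,  false, true,  false },
	{ false, KILL_LATE,  false, false, false },
	/* "Any Policy" rows, expanded for both policies. */
	{ true,  KILL_EARLY, true,  true,  true  },
	{ true,  KILL_LATE,  true,  true,  true  },
	{ true,  KILL_EARLY, true,  false, false },
	{ true,  KILL_LATE,  true,  false, false },
};

/* Assert that the predicate agrees with every observed row. */
void check_shared_cases(void)
{
	size_t i;

	for (i = 0; i < sizeof(shared_cases) / sizeof(shared_cases[0]); i++) {
		const struct mf_case *c = &shared_cases[i];

		assert(shared_expect_sigbus(c->action_required, c->policy,
					    c->mapped, c->dirty) == c->sigbus);
	}
}
```

A test could call check_shared_cases() once and then use
shared_expect_sigbus() to decide whether to wait for a SIGBUS after each
injection.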

(I used MADV_HWPOISON to inject memory failures with MF_ACTION_REQUIRED set.
Since there was no way to use MADV_HWPOISON without first mapping the page in,
the table has no unmapped rows for that case. To inject memory failures without
MF_ACTION_REQUIRED set, I used debugfs’ hwpoison/corrupt-pfn.)

Private memory: The handler unmaps the page from the stage 2 page tables and
does not issue a SIGBUS - the page is never mapped into the host, since it is
private to the guest.

| MF_ACTION_REQUIRED | Kill Policy       | Mapped | Dirty | Result: SIGBUS |
|--------------------|-------------------|--------|-------|----------------|
| false              | PR_MCE_KILL_EARLY | false  | true  | false          |
| false              | PR_MCE_KILL_EARLY | false  | false | false          |
| false              | PR_MCE_KILL_LATE  | false  | true  | false          |
| false              | PR_MCE_KILL_LATE  | false  | false | false          |

(I couldn’t use MADV_HWPOISON here, since private memory cannot be mapped into
userspace and hence has no userspace address.)

I’ll post updated memory failure tests together with the next revision of the
series [1] that fixes MF_DELAYED handling on memory failure.

[1] https://lore.kernel.org/all/cover.1760551864.git.wyihan@xxxxxxxxxx/T/