Re: [PATCH] KVM: guest_memfd: Elaborate on how release() vs. get_pfn() is safe against UAF
From: Ackerley Tng
Date: Tue Jun 02 2026 - 20:40:09 EST
Yan Zhao <yan.y.zhao@xxxxxxxxx> writes:
Coming here from https://lore.kernel.org/all/ahn6qysOGfa74A2E@xxxxxxxxxx/.
> On Thu, Nov 13, 2025 at 03:22:29PM -0800, Sean Christopherson wrote:
>> Add more context and information to the comment in kvm_gmem_release() that
>> explains why there's no synchronization on RCU _or_ kvm->srcu. Point (b)
>> from commit 67b43038ce14 ("KVM: guest_memfd: Remove RCU-protected attribute
>> from slot->gmem.file")
>>
>> b) kvm->srcu ensures that kvm_gmem_unbind() and freeing of a memslot
>> occur after the memslot is no longer visible to kvm_gmem_get_pfn().
>>
>> is especially difficult to fully grok, particularly in light of commit
>> ae431059e75d ("KVM: guest_memfd: Remove bindings on memslot deletion when
>> gmem is dying"), which addressed a race between unbind() and release().
> As mentioned in commit ae431059e75d ("KVM: guest_memfd: Remove bindings on
> memslot deletion when gmem is dying"), unbind() and release() are mutually
> exclusive, i.e., both protected by slots_lock, mentioning "race" here is
> confusing. So, that commit addressed the mishandling in unbind() after
> kvm_gmem_get_file() returning NULL?
>
I agree that the use of the word "race" here is odd since both unbind
and release() are mutually exclusive and protected by slots_lock.
Looking at ae431059e75d again, there are really 3 changes:
1. Refactoring out __kvm_gmem_unbind(), which is mostly about setting
bindings in the xarray to NULL and setting slot->gmem.file to NULL,
2. Checking that if slot->gmem.file is NULL, skip unbinding
3. Checking that if kvm_gmem_get_file() returns NULL, just unbind
without taking filemap_invalidate_lock().
(2.) is explained in the comment
/*
* Nothing to do if the underlying file was _already_ closed, as
* kvm_gmem_release() invalidates and nullifies all bindings.
*/
The real fix is in (3.), I think? Could you explain how
kvm_gmem_get_file() works? Why would the file be dying with the
slots_lock making the unbind and release mutually exclusive?
>> No functional change intended.
>>
>> Cc: Yan Zhao <yan.y.zhao@xxxxxxxxx>
>> Cc: Vishal Annapurve <vannapurve@xxxxxxxxxx>
>> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
>> ---
>> virt/kvm/guest_memfd.c | 20 ++++++++++++++------
>> 1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> index fdaea3422c30..2e09d7ec0cfc 100644
>> --- a/virt/kvm/guest_memfd.c
>> +++ b/virt/kvm/guest_memfd.c
>> @@ -338,17 +338,25 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>> * dereferencing the slot for existing bindings needs to be protected
>> * against memslot updates, specifically so that unbind doesn't race
>> * and free the memslot (kvm_gmem_get_file() will return NULL).
>> - *
>> - * Since .release is called only when the reference count is zero,
>> - * after which file_ref_get() and get_file_active() fail,
>> - * kvm_gmem_get_pfn() cannot be using the file concurrently.
>> - * file_ref_put() provides a full barrier, and get_file_active() the
>> - * matching acquire barrier.
>> */
>
>> mutex_lock(&kvm->slots_lock);
>>
>> filemap_invalidate_lock(inode->i_mapping);
>>
>> + /*
>> + * Note! synchronize_srcu() is _not_ needed after nullifying memslot
>> + * bindings as slot->gmem.file cannot be set back to a non-null value
>> + * without the memslot first being deleted. I.e. this relies on the
>> + * synchronize_srcu_expedited() in kvm_swap_active_memslots() to ensure
>> + * kvm_gmem_get_pfn() (which runs with kvm->srcu held for read) can't
>> + * grab a reference to slot->gmem.file even if the struct file object
>> + * is reallocated.
> Do you mean that
> as kvm_gmem_get_pfn() can't find a stale slot, it can't grab reference to stale
> slot->gmem.file, even if slot->gmem.file has been set to a different value,
> i.e., after invoking unbind(), bind() ?
>
> But I'm not sure why to put the kvm_gmem_get_pfn() relying on
> synchronize_srcu_expedited() in kvm_swap_active_memslots() in the comment of
> release().
This is a good question, I think the comment can be improved to explain
this.
> Without the guard of kvm_gmem_get_file(), kvm_gmem_get_pfn() may need to
> provide some RCU read-critial section too for release() to wait by
> synchronize_srcu()?
>
> So
> * Since .release is called only when the reference count is zero,
> * after which file_ref_get() and get_file_active() fail,
> is still helpful?
>
>> + *
>> + * file_ref_put() provides a full barrier, and __get_file_rcu() the
>> + * matching acquire barrier, to ensure that kvm_gmem_get_file() (via
>> + * __get_file_rcu()) sees refcount==0 or fails the "file reloaded"
>> + * check (file != NULL due to nullifying the file pointer here).
>> + */
>
>> xa_for_each(&f->bindings, index, slot)
>> WRITE_ONCE(slot->gmem.file, NULL);
>>
>>
>> base-commit: 16ec4fb4ac95d878b879192d280db2baeec43272
>> --
>> 2.52.0.rc1.455.g30608eb744-goog
>>