Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
From: Fuad Tabba
Date: Tue May 12 2026 - 04:08:29 EST
On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Mon, May 11, 2026, Fuad Tabba wrote:
> > On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > >
> > > On Mon, May 11, 2026, Fuad Tabba wrote:
> > > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
> > > > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls
> > > > kvm_vm_release() and then kvm_vm_free() without a
> > > > vm_recreate_with_one_vcpu() in between double-closes both FDs.
> > >
> > > That's a test bug, no?
> >
> > Fair point. Let me describe the actual use case and then I'd like
> > your guidance on which way to fix it.
> >
> > The pKVM selftests I'm preparing need to verify that protected guest
> > memory is zeroed when the VM is destroyed before the host regains
> > visibility of those pages. The test pattern is:
> >
> > vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0);
> > /* run protected guest, which writes/shares/unshares its memory */
> > ...
> > TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE); /* still pinned */
> >
> > kvm_vm_release(vm); /* trigger destroy */
> >
> > /* host_mem is still mmap'd in the test process; pKVM should have
> > * zeroed it before unpinning */
> > TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE));
> >
> > kvm_vm_free(vm); /* tidy up */
> > TEST_ASSERT(get_proc_locked_vm_size() == 0);
> >
> > I used kvm_vm_release() because it's the only public API that closes
> > vm->fd to trigger kernel-side destruction. But the existing callers
> > follow it with vm_recreate_with_one_vcpu(), so the "release + later
> > kvm_vm_free()" path isn't exercised today.
> >
> > I see three ways to make this clean:
> > a) This patch: kvm_vm_release() becomes idempotent for its three
> > FDs, matching the kvm_stats_release() idiom it already invokes.
> > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
> > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
> > destruction while leaving the kvm_vm struct intact for
> > post-destruction inspection. kvm_vm_free() learns to handle the
> > half-released state.
> > c) Something else entirely, e.g., the test should manage vm->fd
> > directly and not rely on library helpers for this pattern.
>
> d) Fully kill the VM; validate the semantics with an explict mmap().
>
> The entire point of the test you are writing is to verfiy that a guest_memfd VMA
> doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing
> APIs that kinda sorta do what you want, but not really.
>
> mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> region->guest_memfd);
>
> ...
>
> kvm_vm_free(vm);
>
> TEST_ASSERT(is_zero(mem, ...));
The test isn't about guest_memfd. The pKVM support that just landed
via Will's series [1] is built on anonymous host memory, and that's
the only backing this selftest exercises (VM_MEM_SRC_ANONYMOUS and
VM_MEM_SRC_ANONYMOUS_THP). The property under test is the one
documented by the same series in Documentation/virt/kvm/arm/pkvm.rst:
Status: Isolation of anonymous memory and metadata pages.
...
Reclaimed memory is zeroed by the hypervisor ...
The test mirrors that contract: assert VmLck > donated size before
teardown (pages pinned), trigger kernel-side destruction, assert the
host VMA reads as zero (hypervisor scrubbed on reclaim), then assert
VmLck == 0 (pages unpinned).
kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here. The only
host view of the donated pages is the memslot mmap, and kvm_vm_free()
munmaps it on the way out, so inspection has to happen between
kernel-side destruction and userspace free. kvm_vm_release() is the
only library primitive that does that today.
What do you suggest?
Cheers,
/fuad
[1] https://lore.kernel.org/all/20260105154939.11041-1-will@xxxxxxxxxx/