Re: [PATCH v2 5/7] KVM: guest_memfd: Add cleanup interface for guest teardown
From: Kalra, Ashish
Date: Tue Mar 10 2026 - 18:19:02 EST
Hello Ackerley,
On 3/9/2026 4:01 AM, Ackerley Tng wrote:
> Ashish Kalra <Ashish.Kalra@xxxxxxx> writes:
>
>> From: Ashish Kalra <ashish.kalra@xxxxxxx>
>>
>> Introduce kvm_arch_gmem_cleanup() to perform architecture-specific
>> cleanups when the last file descriptor for the guest_memfd inode is
>> closed. This typically occurs during guest shutdown and termination
>> and allows for final resource release.
>>
>> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
>> ---
>>
>> [...snip...]
>>
>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> index 017d84a7adf3..2724dd1099f2 100644
>> --- a/virt/kvm/guest_memfd.c
>> +++ b/virt/kvm/guest_memfd.c
>> @@ -955,6 +955,14 @@ static void kvm_gmem_destroy_inode(struct inode *inode)
>>
>> static void kvm_gmem_free_inode(struct inode *inode)
>> {
>> +#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_CLEANUP
>> +	/*
>> +	 * Finalize cleanup for the inode once the last guest_memfd
>> +	 * reference is released. This usually occurs after guest
>> +	 * termination.
>> +	 */
>> +	kvm_arch_gmem_cleanup();
>> +#endif
>
> Folks have already talked about the performance implications of doing
> the scan and rmpopt, I just want to call out that one VM could have more
> than one associated guest_memfd too.
Yes, I have observed that kvm_gmem_free_inode() gets invoked multiple times
during SNP guest shutdown, and the same is true for kvm_gmem_destroy_inode().
>
> I think the cleanup function should be thought of as cleanup for the
> inode (even if it doesn't take an inode pointer since it's not (yet)
> required).
>
> So, the gmem cleanup function should not handle deduplicating cleanup
> requests, but the arch function should, if the cleanup needs
> deduplicating.
I agree, the arch function will have to handle the deduplication, and for that
it will probably need to be passed the inode pointer, so that it has per-inode
context to deduplicate against.
>
> Also, .free_inode() is called through RCU, so it could be called after
> some delay. Could it be possible that .free_inode() ends up being called
> way after the associated VM gets torn down, or after KVM the module gets
> unloaded? Does rmpopt still work fine if KVM the module got unloaded?
Yes, .free_inode() can probably get called after the associated VM has
been torn down, which should still be fine for issuing RMPOPT to do
RMP re-optimizations.
As for the KVM module getting unloaded: as part of the forthcoming patch series,
X86_SNP_SHUTDOWN would be issued during KVM module unload, which means SNP would be
disabled and therefore RMP checks are disabled as well.
And as CC_ATTR_HOST_SEV_SNP would then be cleared, snp_perform_rmp_optimization()
will simply return.
Another option is to add a new guest_memfd superblock operation and do the
final guest_memfd cleanup from the .evict_inode() callback. That would ensure
the cleanup is not called through RCU and so avoids any such delays, as follows:
+static void kvm_gmem_evict_inode(struct inode *inode)
+{
+#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_CLEANUP
+	kvm_arch_gmem_cleanup();
+#endif
+	truncate_inode_pages_final(&inode->i_data);
+	clear_inode(inode);
+}
+
@@ -971,6 +979,7 @@ static const struct super_operations kvm_gmem_super_operations = {
 	.alloc_inode	= kvm_gmem_alloc_inode,
 	.destroy_inode	= kvm_gmem_destroy_inode,
 	.free_inode	= kvm_gmem_free_inode,
+	.evict_inode	= kvm_gmem_evict_inode,
 };
Thanks,
Ashish
>
> IIUC the current kmem_cache_free(kvm_gmem_inode_cachep, GMEM_I(inode));
> is fine because in kvm_gmem_exit(), there is a rcu_barrier() before
> kmem_cache_destroy(kvm_gmem_inode_cachep);.
>
>> kmem_cache_free(kvm_gmem_inode_cachep, GMEM_I(inode));
>> }
>>
>> --
>> 2.43.0