Re: [RFC PATCH 4/6] KVM: guest_memfd: Implement bmap inode operation

From: Sean Christopherson
Date: Wed Sep 13 2023 - 13:46:18 EST


On Wed, Sep 13, 2023, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> To inject memory failure, physical address of the page is needed.
> Implement bmap() method to convert the file offset into physical address.
>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
> virt/kvm/Kconfig | 4 ++++
> virt/kvm/guest_mem.c | 28 ++++++++++++++++++++++++++++
> 2 files changed, 32 insertions(+)
>
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 624df45baff0..eb008f0e7cc3 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -115,3 +115,7 @@ config KVM_GENERIC_PRIVATE_MEM
>
> config HAVE_GENERIC_PRIVATE_MEM_HANDLE_ERROR
> bool
> +
> +config KVM_GENERIC_PRIVATE_MEM_BMAP
> + depends on KVM_GENERIC_PRIVATE_MEM
> + bool
> diff --git a/virt/kvm/guest_mem.c b/virt/kvm/guest_mem.c
> index 3678287d7c9d..90dfdfab1f8c 100644
> --- a/virt/kvm/guest_mem.c
> +++ b/virt/kvm/guest_mem.c
> @@ -355,12 +355,40 @@ static int kvm_gmem_error_page(struct address_space *mapping, struct page *page)
> return MF_DELAYED;
> }
>
> +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM_BMAP
> +static sector_t kvm_gmem_bmap(struct address_space *mapping, sector_t block)
> +{
> + struct folio *folio;
> + sector_t pfn = 0;
> +
> + filemap_invalidate_lock_shared(mapping);
> +
> + if (block << PAGE_SHIFT > i_size_read(mapping->host))
> + goto out;
> +
> + folio = filemap_get_folio(mapping, block);
> + if (IS_ERR_OR_NULL(folio))
> + goto out;
> +
> + pfn = folio_pfn(folio) + (block - folio->index);
> + folio_put(folio);
> +
> +out:
> + filemap_invalidate_unlock_shared(mapping);
> + return pfn;

IIUC, hijacking bmap() is a gigantic hack to propagate a host pfn to userspace
without adding a new ioctl() or syscall. If we want to support targeted injection,
I would much, much rather add a KVM ioctl(), e.g. to let userspace inject errors
for a gfn. Returning a pfn for something that AFAICT has nothing to do with pfns
is gross, e.g. the whole "0 is the error code" thing is technically wrong because
'0' is a perfectly valid pfn.
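
To make the "0 is the error code" problem concrete, here is a throwaway userspace
sketch, not kernel code. The toy_* names and the lookup table are hypothetical and
exist only for illustration. It contrasts a bmap()-style return value, where 0 means
both "not found" and "pfn 0", with an interface that reports status and result
separately, which is roughly the shape a dedicated ioctl() could take.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page cache: file page index -> pfn. */
struct toy_entry {
	uint64_t index;	/* page offset within the file */
	uint64_t pfn;	/* backing page frame number */
};

static const struct toy_entry toy_cache[] = {
	{ .index = 0, .pfn = 0 },	/* pfn 0 is a legitimate value */
	{ .index = 1, .pfn = 0x1234 },
};

#define TOY_CACHE_SIZE (sizeof(toy_cache) / sizeof(toy_cache[0]))

/* bmap()-style lookup: 0 doubles as the error value, so pfn 0 is ambiguous. */
uint64_t toy_bmap(uint64_t index)
{
	for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
		if (toy_cache[i].index == index)
			return toy_cache[i].pfn;
	}
	return 0;
}

/* Alternative: status in the return value, result via an out-parameter. */
int toy_lookup_pfn(uint64_t index, uint64_t *pfn)
{
	for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
		if (toy_cache[i].index == index) {
			*pfn = toy_cache[i].pfn;
			return 0;
		}
	}
	return -ENOENT;
}

/* Small self-check showing the difference between the two interfaces. */
void toy_demo(void)
{
	uint64_t pfn = ~0ULL;

	/* A present page at pfn 0 and a missing page look identical. */
	assert(toy_bmap(0) == toy_bmap(99));

	/* With a separate status, the two cases are distinguishable. */
	assert(toy_lookup_pfn(0, &pfn) == 0 && pfn == 0);
	assert(toy_lookup_pfn(99, &pfn) == -ENOENT);
}
```

With toy_bmap(), a caller cannot tell a populated page at pfn 0 from a hole, whereas
toy_lookup_pfn() returns -ENOENT for the hole and 0 plus the pfn for the hit.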

My vote is to drop this and not extend the injection information for the initial
merge, i.e. rely on point testing to verify kvm_gmem_error_page(), and defer adding
uAPI to let selftests inject errors.

> +
> +}
> +#endif
> +
> static const struct address_space_operations kvm_gmem_aops = {
> .dirty_folio = noop_dirty_folio,
> #ifdef CONFIG_MIGRATION
> .migrate_folio = kvm_gmem_migrate_folio,
> #endif
> .error_remove_page = kvm_gmem_error_page,
> +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM_BMAP
> + .bmap = kvm_gmem_bmap,
> +#endif
> };
>
> static int kvm_gmem_getattr(struct mnt_idmap *idmap,
> --
> 2.25.1
>