Re: [RFC PATCH 39/39] KVM: guest_memfd: Dynamically split/reconstruct HugeTLB page

From: Yan Zhao
Date: Thu Apr 03 2025 - 08:36:03 EST


On Tue, Sep 10, 2024 at 11:44:10PM +0000, Ackerley Tng wrote:
> +/*
> + * Allocates and then caches a folio in the filemap. Returns a folio with
> + * refcount of 2: 1 after allocation, and 1 taken by the filemap.
> + */
> +static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode,
> + pgoff_t index)
> +{
> + struct kvm_gmem_hugetlb *hgmem;
> + pgoff_t aligned_index;
> + struct folio *folio;
> + int nr_pages;
> + int ret;
> +
> + hgmem = kvm_gmem_hgmem(inode);
> + folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool);
> + if (IS_ERR(folio))
> + return folio;
> +
> + nr_pages = 1UL << huge_page_order(hgmem->h);
> + aligned_index = round_down(index, nr_pages);
There may be a gap here.

When a guest_memfd is bound to a slot where slot->base_gfn is not 2M/1G-aligned
and slot->gmem.pgoff is 0, then even a 2M/1G-aligned index maps to a GFN that
is not 2M/1G-aligned.

However, TDX requires that private huge pages be 2M aligned in GFN.

> + ret = kvm_gmem_hugetlb_filemap_add_folio(inode->i_mapping, folio,
> + aligned_index,
> + htlb_alloc_mask(hgmem->h));
> + WARN_ON(ret);
> +
> spin_lock(&inode->i_lock);
> inode->i_blocks += blocks_per_huge_page(hgmem->h);
> spin_unlock(&inode->i_lock);
>
> - return page_folio(requested_page);
> + return folio;
> +}