Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory

From: Zenghui Yu
Date: Fri Apr 12 2019 - 05:37:17 EST



On 2019/4/11 23:16, Suzuki K Poulose wrote:
Hi Zenghui,

On 11/04/2019 02:59, Zenghui Yu wrote:
Hi Suzuki,

On 2019/4/10 23:23, Suzuki K Poulose wrote:
We support mapping host memory backed by PMD transparent hugepages
at stage2 as huge pages. However, the checks are now spread across
two different places. Let us unify the handling of the THPs to
keep the code cleaner (and future-proof for PUD THP support).
This patch moves transparent_hugepage_adjust() closer to the caller
to avoid a forward declaration for fault_supports_stage2_huge_mappings().

Also, since we already handle the case where the host VA and the guest
PA may not be aligned, the explicit VM_BUG_ON() is not required.

Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
Cc: Christoffer Dall <christoffer.dall@xxxxxxx>
Cc: Zenghui Yu <yuzenghui@xxxxxxxxxx>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
---
 virt/kvm/arm/mmu.c | 123 +++++++++++++++++++++++++++--------------------------
 1 file changed, 62 insertions(+), 61 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6d73322..714eec2 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 	return ret;
 }
-static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
-{
-	kvm_pfn_t pfn = *pfnp;
-	gfn_t gfn = *ipap >> PAGE_SHIFT;
-	struct page *page = pfn_to_page(pfn);
-
-	/*
-	 * PageTransCompoundMap() returns true for THP and
-	 * hugetlbfs. Make sure the adjustment is done only for THP
-	 * pages.
-	 */
-	if (!PageHuge(page) && PageTransCompoundMap(page)) {
-		unsigned long mask;
-		/*
-		 * The address we faulted on is backed by a transparent huge
-		 * page. However, because we map the compound huge page and
-		 * not the individual tail page, we need to transfer the
-		 * refcount to the head page. We have to be careful that the
-		 * THP doesn't start to split while we are adjusting the
-		 * refcounts.
-		 *
-		 * We are sure this doesn't happen, because mmu_notifier_retry
-		 * was successful and we are holding the mmu_lock, so if this
-		 * THP is trying to split, it will be blocked in the mmu
-		 * notifier before touching any of the pages, specifically
-		 * before being able to call __split_huge_page_refcount().
-		 *
-		 * We can therefore safely transfer the refcount from PG_tail
-		 * to PG_head and switch the pfn from a tail page to the head
-		 * page accordingly.
-		 */
-		mask = PTRS_PER_PMD - 1;
-		VM_BUG_ON((gfn & mask) != (pfn & mask));
-		if (pfn & mask) {
-			*ipap &= PMD_MASK;
-			kvm_release_pfn_clean(pfn);
-			pfn &= ~mask;
-			kvm_get_pfn(pfn);
-			*pfnp = pfn;
-		}
-
-		return true;
-	}
-
-	return false;
-}
-
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1677,6 +1630,61 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
 	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
 }
+/*
+ * Check if the given hva is backed by a transparent huge page (THP)
+ * and whether it can be mapped using block mapping in stage2. If so, adjust
+ * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
+ * supported. This will need to be updated to support other THP sizes.
+ *
+ * Returns the size of the mapping.
+ */
+static unsigned long
+transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
+			    unsigned long hva, kvm_pfn_t *pfnp,
+			    phys_addr_t *ipap)
+{
+	kvm_pfn_t pfn = *pfnp;
+	struct page *page = pfn_to_page(pfn);
+
+	/*
+	 * PageTransCompoundMap() returns true for THP and
+	 * hugetlbfs. Make sure the adjustment is done only for THP
+	 * pages. Also make sure that the HVA and IPA are sufficiently
+	 * aligned and that the block map is contained within the memslot.
+	 */
+	if (!PageHuge(page) && PageTransCompoundMap(page) &&

By the time we get here, we have already ensured that we are only dealing
with normal-size pages and that no hugetlbfs pages are involved, so
"!PageHuge(page)" will always return true and the check can be dropped.

I think that is a bit tricky. If someone ever modifies user_mem_abort()
and we end up getting called with a HugeTLB-backed page, things could go
wrong.

That would be bad. I'm not sure whether it's possible in the future, though.

I could remove the check, but would like to add a WARN_ON_ONCE() to make
sure our assumption holds.

i.e,
	WARN_ON_ONCE(PageHuge(page));
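
Put together, the helper might then read roughly as below. This is only a
sketch to illustrate the idea: it mirrors the adjustment done by the removed
helper (doing the release/get unconditionally for brevity), and the final
shape is of course up to the respin.

	kvm_pfn_t pfn = *pfnp;
	struct page *page = pfn_to_page(pfn);

	/* Assumption: user_mem_abort() never hands us a hugetlbfs page. */
	WARN_ON_ONCE(PageHuge(page));

	if (PageTransCompoundMap(page) &&
	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
		/*
		 * Back the whole block with the head page of the THP:
		 * align the IPA down to PMD_SIZE and move the reference
		 * from the tail page to the head page.
		 */
		*ipap &= PMD_MASK;
		kvm_release_pfn_clean(pfn);
		pfn &= ~(PTRS_PER_PMD - 1);
		kvm_get_pfn(pfn);
		*pfnp = pfn;

		return PMD_SIZE;
	}

	/* Otherwise fall back to a PAGE_SIZE mapping. */
	return PAGE_SIZE;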

But this is a careful approach, and I think it will be valuable both for
developers and for the code itself. Thanks!


zenghui


	if (PageTransCompoundMap(page) &&
	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {

...
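
As a side note on the dropped VM_BUG_ON((gfn & mask) != (pfn & mask)):
assuming fault_supports_stage2_huge_mapping() rejects memslots whose IPA and
userspace address are not equally aligned within map_size (as the current
helper in mmu.c does), the assertion holds by construction. A rough worked
example with made-up addresses (4K pages, 2M PMD, so mask == 0x1ff):

	gpa_start   = 0x40100000	/* memslot base IPA */
	uaddr_start = 0x7f2a40100000	/* memslot userspace address */

Both sit at offset 0x100000 within their 2M block, so the memslot passes the
co-alignment check for map_size == PMD_SIZE. A fault at hva 0x7f2a40123000
then corresponds to ipa 0x40123000, and if that hva is backed by a
PMD-mapped THP, the faulting pfn sits at the same offset into the compound
page:

	gfn & 0x1ff == pfn & 0x1ff == 0x123

which is exactly what the old VM_BUG_ON() used to assert.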