[PATCH 4.9 155/157] KVM: do not assume PTE is writable after follow_pfn

From: Greg Kroah-Hartman
Date: Mon Jan 24 2022 - 14:08:59 EST


From: Paolo Bonzini <pbonzini@xxxxxxxxxx>

commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.

In order to convert an HVA to a PFN, KVM usually tries to use
the get_user_pages family of functinso. This however is not
possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.

In doing this however KVM loses the information on whether the
PFN is writable. That is usually not a problem because the main
use of VM_IO vmas with KVM is for BARs in PCI device assignment,
however it is a bug. To fix it, use follow_pte and check pte_write
while under the protection of the PTE lock. The information can
be used to fail hva_to_pfn_remapped or passed back to the
caller via *writable.

Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
up page faults before giving up", 2016-07-05); however, even older version
have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
Handle vma regions with no backing page", 2008-07-20), as they also did
not check whether the PFN was writable.

Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
Reported-by: David Stevens <stevensd@xxxxxxxxxx>
Cc: 3pvd@xxxxxxxxxx
Cc: Jann Horn <jannh@xxxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
[OP: backport to 4.19, adjust follow_pte() -> follow_pte_pmd()]
Signed-off-by: Ovidiu Panait <ovidiu.panait@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
[bwh: Backport to 4.9: follow_pte_pmd() does not take start or end
parameters]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
virt/kvm/kvm_main.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1519,9 +1519,11 @@ static int hva_to_pfn_remapped(struct vm
kvm_pfn_t *p_pfn)
{
unsigned long pfn;
+ pte_t *ptep;
+ spinlock_t *ptl;
int r;

- r = follow_pfn(vma, addr, &pfn);
+ r = follow_pte_pmd(vma->vm_mm, addr, &ptep, NULL, &ptl);
if (r) {
/*
* get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
@@ -1536,14 +1538,19 @@ static int hva_to_pfn_remapped(struct vm
if (r)
return r;

- r = follow_pfn(vma, addr, &pfn);
+ r = follow_pte_pmd(vma->vm_mm, addr, &ptep, NULL, &ptl);
if (r)
return r;
+ }

+ if (write_fault && !pte_write(*ptep)) {
+ pfn = KVM_PFN_ERR_RO_FAULT;
+ goto out;
}

if (writable)
- *writable = true;
+ *writable = pte_write(*ptep);
+ pfn = pte_pfn(*ptep);

/*
* Get a reference here because callers of *hva_to_pfn* and
@@ -1558,6 +1565,8 @@ static int hva_to_pfn_remapped(struct vm
*/
kvm_get_pfn(pfn);

+out:
+ pte_unmap_unlock(ptep, ptl);
*p_pfn = pfn;
return 0;
}