[PATCH v13 55/85] KVM: arm64: Mark "struct page" pfns accessed/dirty before dropping mmu_lock

From: Sean Christopherson
Date: Thu Oct 10 2024 - 14:43:47 EST


Mark pages/folios accessed+dirty prior to dropping mmu_lock, as marking a
page/folio dirty after it has been written back can make some filesystems
unhappy (backing KVM guests will such filesystem files is uncommon, and
the race is minuscule, hence the lack of complaints).

While scary sounding, practically speaking the worst case scenario is that
KVM would trigger this WARN in filemap_unaccount_folio():

/*
* At this point folio must be either written or cleaned by
* truncate. Dirty folio here signals a bug and loss of
* unwritten data - on ordinary filesystems.
*
* But it's harmless on in-memory filesystems like tmpfs; and can
* occur when a driver which did get_user_pages() sets page dirty
* before putting it, while the inode is being finally evicted.
*
* Below fixes dirty accounting after removing the folio entirely
* but leaves the dirty flag set: it has no effect for truncated
* folio and anyway will be cleared before returning folio to
* buddy allocator.
*/
if (WARN_ON_ONCE(folio_test_dirty(folio) &&
mapping_can_writeback(mapping)))
folio_account_cleaned(folio, inode_to_wb(mapping->host));

KVM won't actually write memory because the stage-2 mappings are protected
by the mmu_notifier, i.e. there is no risk of loss of data, even if the
VM were backed by memory that needs writeback.

See the link below for additional details.

This will also allow converting arm64 to kvm_release_faultin_page(), which
requires that mmu_lock be held (for the aforementioned reason).

Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@xxxxxxxxx
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/arm64/kvm/mmu.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index dd221587fcca..ecc6c2b56c43 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1692,15 +1692,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
}

out_unlock:
+ if (writable && !ret)
+ kvm_release_pfn_dirty(pfn);
+ else
+ kvm_release_pfn_clean(pfn);
+
read_unlock(&kvm->mmu_lock);

/* Mark the page dirty only if the fault is handled successfully */
- if (writable && !ret) {
- kvm_set_pfn_dirty(pfn);
+ if (writable && !ret)
mark_page_dirty_in_slot(kvm, memslot, gfn);
- }

- kvm_release_pfn_clean(pfn);
return ret != -EAGAIN ? ret : 0;
}

--
2.47.0.rc1.288.g06298d1525-goog