Hi Zhengui,
On 15/03/2019 08:21, Zheng Xiang wrote:
Hi Suzuki,
I have tested this patch. The VM doesn't hang, and we get the expected WARNING log:
Thanks for the quick testing!
However, we also get the following unexpected log:
[ 908.329900] BUG: Bad page state in process qemu-kvm pfn:a2fb41cf
[ 908.339415] page:ffff7e28bed073c0 count:-4 mapcount:0 mapping:0000000000000000 index:0x0
[ 908.339416] flags: 0x4ffffe0000000000()
[ 908.339418] raw: 4ffffe0000000000 dead000000000100 dead000000000200 0000000000000000
[ 908.339419] raw: 0000000000000000 0000000000000000 fffffffcffffffff 0000000000000000
[ 908.339420] page dumped because: nonzero _refcount
[ 908.339437] CPU: 32 PID: 72599 Comm: qemu-kvm Kdump: loaded Tainted: G    B  W        5.0.0+ #1
[ 908.339438] Call trace:
[ 908.339439]  dump_backtrace+0x0/0x188
[ 908.339441]  show_stack+0x24/0x30
[ 908.339442]  dump_stack+0xa8/0xcc
[ 908.339443]  bad_page+0xf0/0x150
[ 908.339445]  free_pages_check_bad+0x84/0xa0
[ 908.339446]  free_pcppages_bulk+0x4b8/0x750
[ 908.339448]  free_unref_page_commit+0x13c/0x198
[ 908.339449]  free_unref_page+0x84/0xa0
[ 908.339451]  __free_pages+0x58/0x68
[ 908.339452]  zap_huge_pmd+0x290/0x2d8
[ 908.339454]  unmap_page_range+0x2b4/0x470
[ 908.339455]  unmap_single_vma+0x94/0xe8
[ 908.339457]  unmap_vmas+0x8c/0x108
[ 908.339458]  exit_mmap+0xd4/0x178
[ 908.339459]  mmput+0x74/0x180
[ 908.339460]  do_exit+0x2b4/0x5b0
[ 908.339462]  do_group_exit+0x3c/0xe0
[ 908.339463]  __arm64_sys_exit_group+0x24/0x28
[ 908.339465]  el0_svc_common+0xa0/0x180
[ 908.339466]  el0_svc_handler+0x38/0x78
[ 908.339467]  el0_svc+0x8/0xc
That's bad; we seem to be making up to 4 unbalanced put_page() calls.
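For reference, the -4 can be cross-checked against the raw dump. The sketch below assumes the 5.0 struct page layout on little-endian arm64, where the seventh 64-bit word of the dump holds _mapcount in its lower half and _refcount in its upper half, and where the reported mapcount is the stored value plus one. The helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret 32 raw bits as a signed value without relying on
 * implementation-defined integer conversion.
 */
static int32_t as_signed32(uint32_t bits)
{
	int32_t value;

	memcpy(&value, &bits, sizeof(value));
	return value;
}

/*
 * Hypothetical helper. Assumption: _refcount occupies the upper
 * 32 bits of the dumped word (5.0 layout, little-endian).
 */
static int32_t raw_word_to_refcount(uint64_t word)
{
	return as_signed32((uint32_t)(word >> 32));
}

/*
 * Hypothetical helper. Assumption: _mapcount occupies the lower
 * 32 bits and is reported as the stored value plus one.
 */
static int32_t raw_word_to_mapcount(uint64_t word)
{
	return as_signed32((uint32_t)(word & 0xffffffffu)) + 1;
}
```

Feeding in 0xfffffffcffffffff from the second "raw:" line gives a refcount of -4 and a mapcount of 0, which matches the "count:-4 mapcount:0" line above.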
---
 virt/kvm/arm/mmu.c | 51 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 16 deletions(-)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 66e0fbb5..04b0f9b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1076,24 +1076,38 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
            * Skip updating the page table if the entry is
            * unchanged.
            */
-        if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+        if (pmd_val(old_pmd) == pmd_val(*new_pmd)) {
               return 0;
-
+        } else if (WARN_ON_ONCE(!pmd_thp_or_huge(old_pmd))) {
           /*
-         * Mapping in huge pages should only happen through a
-         * fault. If a page is merged into a transparent huge
-         * page, the individual subpages of that huge page
-         * should be unmapped through MMU notifiers before we
-         * get here.
-         *
-         * Merging of CompoundPages is not supported; they
-         * should become splitting first, unmapped, merged,
-         * and mapped back in on-demand.
+         * If we have PTE level mapping for this block,
+         * we must unmap it to avoid inconsistent TLB
+         * state. We could end up in this situation if
+         * the memory slot was marked for dirty logging
+         * and was reverted, leaving PTE level mappings
+         * for the pages accessed during the period.
+         * Normal THP split/merge follows mmu_notifier
+         * callbacks and do get handled accordingly.
            */
-        VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
+            unmap_stage2_range(kvm, (addr & S2_PMD_MASK), S2_PMD_SIZE);
It seems that kvm decreases the _refcount of the page twice in transparent_hugepage_adjust()
and unmap_stage2_range().
But I thought we should be doing that on the head_page already, as this is THP.
I will take a look and get back to you on this. Btw, is it possible for you
to turn on CONFIG_DEBUG_VM and re-run with the above patch?
Kind regards
Suzuki