Hi Suzuki,

I will merge the other thread into this one and add the necessary Cc list.

That WARN_ON call trace is very easy to reproduce on my armv8a server after I
start 20 guests and run memhog in the host. Of course, ksm should be enabled.

For your question about my inject-fault debug patch:
index 7f6a944..ab8545e 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -290,12 +290,17 @@ static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
 * destroying the VM), otherwise another faulting VCPU may come in and mess
 * with things behind our backs.
 */
+extern int trigger_by_ksm;
 static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
 {
 	pgd_t *pgd;
 	phys_addr_t addr = start, end = start + size;
 	phys_addr_t next;
+	/* Inject fault: fake the unaligned range end that KSM hands in. */
+	if (trigger_by_ksm)
+		end -= 0x200;
+
 	assert_spin_locked(&kvm->mmu_lock);
 	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
 	do {
I need to point out that I never reproduced it without this debugging patch.
Suzuki, thanks for the comments.

>> I proposed another ksm patch: https://lkml.org/lkml/2018/5/3/1042
>> The root cause is that ksm adds some extra flag bits to the address to
>> indicate whether the page is in the stable tree; this leaves the address
>> no longer aligned to PAGE_SIZE.
> Thanks for the pointer. In the future, please Cc the people relevant to the
> discussion in the patches.
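To make that concrete, here is a small user-space sketch of the tagging
scheme. The flag values mirror the defines in mm/ksm.c (rmap_item->address
keeps them in its low bits); the address value is made up for illustration:

#include <stdio.h>

#define PAGE_SIZE	0x1000UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define UNSTABLE_FLAG	0x100UL	/* as in mm/ksm.c: node of the unstable tree */
#define STABLE_FLAG	0x200UL	/* as in mm/ksm.c: listed from the stable tree */

int main(void)
{
	unsigned long address = 0x12340000UL;		/* page-aligned hva */
	unsigned long tagged  = address | STABLE_FLAG;	/* 0x12340200 */

	/* Once the tagged value leaks into an mmu notifier range, it is
	 * no longer PAGE_SIZE aligned. */
	printf("tagged = %#lx, aligned = %d\n",
	       tagged, (tagged & ~PAGE_MASK) == 0UL);

	/* Masking with PAGE_MASK recovers the real address. */
	printf("masked = %#lx\n", tagged & PAGE_MASK);
	return 0;
}

Note that STABLE_FLAG is exactly the 0x200 offset I subtract in the debug
patch above.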
>> From the arm kvm mmu point of view, do you think handle_hva_to_gpa still
>> needs to handle the unaligned case?
> I don't think we should do that. Had we done this, we would never have caught
> this bug in KSM. Eventually, if some other new implementation comes up with a
> new notifier consumer which doesn't check alignment and doesn't WARN, it could
> simply do the wrong thing. So I believe what we have is a good measure to make
> sure that things are in the right order.
>> IMO, the PAGE_SIZE alignment is still needed, because we should not let the
>> bottom-level function kvm_age_hva_handler handle the exception. Please refer
>> to the x86 and powerpc implementations of kvm_handle_hva_range(); they both
>> align the hva via hva_to_gfn_memslot().
> From an API perspective, you are passed a "start" and an "end" address, so you
> could potentially do the wrong thing if you align "start" and "end". Maybe
> those handlers should also do the same thing as we do.
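For reference, the alignment in those paths falls out of the gfn conversion
itself. Below is a stand-alone rework of hva_to_gfn_memslot() (modelled on
the helper in include/linux/kvm_host.h, with the memslot struct trimmed to
the two fields used and the example numbers invented):

#include <stdio.h>

#define PAGE_SHIFT	12

typedef unsigned long long gfn_t;

/* Trimmed-down stand-in for struct kvm_memory_slot. */
struct kvm_memory_slot {
	gfn_t base_gfn;
	unsigned long userspace_addr;
};

/* The right shift rounds down, so any sub-page offset (such as KSM's
 * 0x200 tag) is silently discarded in the hva-to-gfn conversion. */
static gfn_t hva_to_gfn_memslot(unsigned long hva,
				struct kvm_memory_slot *slot)
{
	return slot->base_gfn + ((hva - slot->userspace_addr) >> PAGE_SHIFT);
}

int main(void)
{
	struct kvm_memory_slot slot = {
		.base_gfn = 0x1000,
		.userspace_addr = 0x12300000UL,
	};

	/* An aligned hva and its KSM-tagged twin map to the same gfn. */
	printf("gfn(0x12340000) = %#llx\n",
	       hva_to_gfn_memslot(0x12340000UL, &slot));
	printf("gfn(0x12340200) = %#llx\n",
	       hva_to_gfn_memslot(0x12340200UL, &slot));
	return 0;
}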
But handle_hva_to_gpa has possibly already partially adjusted the alignment:
   1750         kvm_for_each_memslot(memslot, slots) {
   1751                 unsigned long hva_start, hva_end;
   1752                 gfn_t gpa;
   1753
   1754                 hva_start = max(start, memslot->userspace_addr);
   1755                 hva_end = min(end, memslot->userspace_addr +
   1756                               (memslot->npages << PAGE_SHIFT));
At line 1755, let us assume that end = 0x12340200 and
memslot->userspace_addr + (memslot->npages << PAGE_SHIFT) = 0x12340000.
Then hva_end is clamped to the aligned slot boundary 0x12340000, while
hva_start stays unaligned (taking, say, start = 0x1233F200, which carries
the same 0x200 tag). The resulting size is PAGE_SIZE - 0x200, just as in
my inject-fault debugging patch.
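Plugging those numbers in (again, the start value 0x1233F200 and the slot
start are my assumptions; only end was given above), the clamping arithmetic
works out like this:

#include <stdio.h>

#define PAGE_SIZE	0x1000UL
#define max(a, b)	((a) > (b) ? (a) : (b))
#define min(a, b)	((a) < (b) ? (a) : (b))

int main(void)
{
	/* KSM-tagged, unaligned notifier range ("start" is assumed). */
	unsigned long start = 0x1233F200UL;
	unsigned long end   = 0x12340200UL;

	/* Memslot boundaries as used by handle_hva_to_gpa(); the slot
	 * start is an arbitrary aligned value below "start". */
	unsigned long slot_start = 0x12300000UL;
	unsigned long slot_end   = 0x12340000UL;

	unsigned long hva_start = max(start, slot_start);	/* 0x1233F200 */
	unsigned long hva_end   = min(end, slot_end);		/* 0x12340000 */

	/* hva_end got clamped to the aligned slot end, hva_start did not:
	 * 0x12340000 - 0x1233F200 = 0xE00 = PAGE_SIZE - 0x200. */
	printf("size = %#lx (PAGE_SIZE - 0x200 = %#lx)\n",
	       hva_end - hva_start, PAGE_SIZE - 0x200);
	return 0;
}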