Re: [PATCH v4 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions

From: Gavin Shan
Date: Thu Apr 22 2021 - 19:35:25 EST


Hi Keqian,

On 4/22/21 5:41 PM, Keqian Zhu wrote:
On 2021/4/22 10:12, Gavin Shan wrote:
On 4/21/21 4:28 PM, Keqian Zhu wrote:
On 2021/4/21 14:38, Gavin Shan wrote:
On 4/16/21 12:03 AM, Keqian Zhu wrote:

[...]


Yeah, Sorry that I missed that part. Something associated with Santosh's
patch. The flag can be not existing until the page fault happened on
the vma. In this case, the check could be not working properly.

[PATCH] KVM: arm64: Correctly handle the mmio faulting
Yeah, you are right.

If that happens, we won't try to use block mapping for memslot with VM_PFNMAP.
But it keeps a same logic with old code.

1. When without dirty-logging, we won't try block mapping for it, and we'll
finally know that it's device, so won't try to do adjust THP (Transparent Huge Page)
for it.
2. If userspace wrongly enables dirty logging for this memslot, we'll force_pte for it.


It's not about the patch itself and just want more discussion to get more details.
The patch itself looks good to me. I got two questions as below:

(1) The memslot fails to be added if it's backed by MMIO region and dirty logging is
enabled in kvm_arch_prepare_memory_region(). As Santosh reported, the corresponding
vma could be associated with MMIO region and VM_PFNMAP is missed. In this case,
kvm_arch_prepare_memory_region() isn't returning error, meaning the memslot can be
added successfully and block mapping isn't used, as you mentioned. The question is
the memslot is added, but the expected result would be failure.

(2) If dirty logging is enabled on the MMIO memslot, everything should be fine. If
the dirty logging isn't enabled and VM_PFNMAP isn't set yet in user_mem_abort(),
block mapping won't be used and PAGE_SIZE is picked, but the failing IPA might
be good candidate for block mapping. It means we miss something for blocking
mapping?

By the way, do you have idea why dirty logging can't be enabled on MMIO memslot?
I guess Marc might know the history. For example, QEMU is taking "/dev/mem" or
"/dev/kmem" to back guest's memory, the vma is marked as MMIO, but dirty logging
and migration isn't supported?

Thanks,
Gavin