Re: [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags

From: David Hildenbrand
Date: Tue Jan 14 2025 - 08:18:12 EST


On 13.01.25 17:27, Jason Gunthorpe wrote:
On Fri, Jan 10, 2025 at 09:15:53PM +0000, Ankit Agrawal wrote:
This patch solves the problem where the kernel can have VMAs pointing
at cacheable memory without pfn_is_map_memory() being true, e.g. DAX
memremap cases and CXL/pre-CXL devices. This memory is now properly
marked as cacheable in KVM.

Does this only result in worse performance, or does it also affect
correctness? I suspect performance is the problem, correct?

Correctness. Things like atomics don't work on non-cacheable mappings.

Hah! This needs to be highlighted in the patch description. And maybe
this even implies Fixes: etc?

Understood. I'll put that in the patch description.

You likely assume you never end up with a COW VM_PFNMAP mapping -- I
think that is possible when doing a MAP_PRIVATE /dev/mem mapping on
systems that allow mapping /dev/mem. Maybe one could just reject such
cases (if the KVM PFN lookup code does not already reject them, which
might just be the case IIRC).

At least VFIO enforces SHARED or it won't create the VMA.

drivers/vfio/pci/vfio_pci_core.c:       if ((vma->vm_flags & VM_SHARED) == 0)

That makes a lot of sense for VFIO.

So, I suppose we don't need to check this? Especially if we only extend
the changes to the following case:

I would check if it is a VM_PFNMAP, and outright refuse any page if is_cow_mapping(vma->vm_flags) is true.
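
To make that check concrete, here is a minimal userspace sketch. It is not
taken from the patch: the flag values and is_cow_mapping() are copied in to
mirror include/linux/mm.h so the snippet builds outside the kernel, and
pfnmap_is_acceptable() is a hypothetical helper name used only for
illustration.

```c
#include <stdbool.h>

/* Values mirror include/linux/mm.h; redefined here only so that the
 * sketch compiles in userspace. */
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL
#define VM_PFNMAP   0x00000400UL

/* Same logic as the kernel helper: a mapping is COW when it is private
 * (VM_SHARED clear) but may become writable (VM_MAYWRITE set). */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical helper: accept a VMA for the special PFNMAP handling
 * only if it is VM_PFNMAP and not a COW mapping. */
static bool pfnmap_is_acceptable(unsigned long vm_flags)
{
	if (!(vm_flags & VM_PFNMAP))
		return false;	/* not the special case discussed here */
	if (is_cow_mapping(vm_flags))
		return false;	/* e.g. MAP_PRIVATE mapping of /dev/mem */
	return true;
}
```

With this, a shared VFIO-style mapping (VM_PFNMAP | VM_SHARED | VM_MAYWRITE)
is accepted, while a private writable one (VM_PFNMAP | VM_MAYWRITE) is
refused.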

- type is VM_PFNMAP &&
- user mapping is cacheable (MT_NORMAL or MT_NORMAL_TAGGED) &&
- The suggested VM_FORCE_CACHED is set.

Do we really want another weirdly defined VMA flag? I'd really like to
avoid this..

Agreed.

Can't we do a "this is a weird VM_PFNMAP thing, let's consult the VMA prot + whatever PFN information to find out if it is weird-device and how we could safely map it?"

Ideally, we'd separate this logic from the "this is a normal VMA that doesn't need any such special casing", and even stop playing PFN games on these normal VMAs completely.


How is VFIO going to know any better whether it should set
the flag when the questions seem to be around things like MTE that
have nothing to do with VFIO?

I assume MTE does not apply at all to VM_PFNMAP, at least arch_calc_vm_flag_bits() tells me that VM_MTE_ALLOWED should never get set there.

So for VFIO and friends with VM_PFNMAP mappings, maybe we can completely ignore that?

--
Cheers,

David / dhildenb