Re: [PATCH v3 2/2] kvm: arm64: set io memory s2 pte as normalnc for vfio pci devices

From: Catalin Marinas
Date: Mon Jan 08 2024 - 06:05:19 EST


On Fri, Jan 05, 2024 at 08:42:31PM +0000, Oliver Upton wrote:
> On Thu, Dec 21, 2023 at 01:19:18PM +0000, Catalin Marinas wrote:
> > > Apologies, I didn't mean to question what's going on here from the
> > > hardware POV. My concern was more from the kernel + user interfaces POV,
> > > this all seems to work (specifically for PCI) by maintaining an
> > > intentional mismatch between the VFIO stage-1 and KVM stage-2 mappings.
> >
> > If you stare at it long enough, the mismatch starts to look fine ;).
> > Even if you have the VFIO stage 1 Normal NC, KVM stage 2 Normal NC, you
> > can still have the guest setting stage 1 to Device and introduce an
> > architectural mismatch. These aliases have some bad reputation but the
> > behaviour is constrained architecturally.
> >
> > IMHO we should move on from this attribute mismatch since we can't fully
> > solve it anyway and focus instead on what the device, system can
> > tolerate, who's responsible for deciding which MMIO ranges can be mapped
> > as Normal NC.
>
> Fair enough :) The other slightly unsavory part is that we're baking
> the mapping policy into KVM. I'd prefer it if this policy were kept in
> userspace somehow, but there's no actual usecase for userspace selecting
> memory attributes at this point.

If by policy you mean who's deciding the write-combining relaxation,
this series moved it to the vfio-pci host driver. KVM only picks the
appropriate memory type for stage 2 based on the vma flags. That's
Normal NC in the absence of anything better on arm64, and it does more
than just write-combining, but we can describe what this new VM_* flag
allows.
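
To make the split of responsibilities concrete, here is a rough
userspace-compilable sketch of the selection logic, not the code from
the series. All names (the SKETCH_* flags, the enum and the function)
are made up for illustration, and the "is this device memory" test is
reduced to a single flag bit, which is a simplification of what KVM
actually checks:

```c
#include <stdint.h>

/* Hypothetical flag bits; the real VM_* flag is not named in this mail. */
#define SKETCH_VM_PFNMAP        (1u << 0) /* vma maps raw PFNs, i.e. MMIO */
#define SKETCH_VM_ALLOW_RELAXED (1u << 1) /* set by the host driver, e.g. vfio-pci */

/* Stage 2 memory types, illustrative only. */
enum sketch_s2_memtype {
	SKETCH_S2_NORMAL_WB,     /* ordinary guest RAM */
	SKETCH_S2_DEVICE_nGnRE,  /* default for I/O memory */
	SKETCH_S2_NORMAL_NC,     /* relaxed I/O memory */
};

/*
 * KVM makes no policy decision here: it only translates the flags that
 * the driver left on the vma into a stage 2 memory type.
 */
static enum sketch_s2_memtype sketch_pick_s2_memtype(uint32_t vma_flags)
{
	if (!(vma_flags & SKETCH_VM_PFNMAP))
		return SKETCH_S2_NORMAL_WB;

	/* The driver vouched for this range, so relax it to Normal NC. */
	if (vma_flags & SKETCH_VM_ALLOW_RELAXED)
		return SKETCH_S2_NORMAL_NC;

	return SKETCH_S2_DEVICE_nGnRE;
}
```

The point of the sketch is that the relaxation bit is set by the driver
that knows the device, while KVM just consumes it.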

If we want to keep this decision strictly in user space, we can do it
with some ioctl(). The downside is that the host kernel now puts more
trust in the user VMM, so my preference would be to keep this in the
vfio driver. Or we can do both: vfio-pci allows the relaxation and the
VMM tells KVM to go for a more relaxed stage 2 via an ioctl().
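
The three options above differ only in whose consent is required. A
minimal sketch, assuming two booleans stand in for "the driver marked
the vma" and "the VMM asked via some (hypothetical, not yet defined)
ioctl()"; none of these names exist in the kernel:

```c
#include <stdbool.h>

/* Hypothetical policy choices, for illustration only. */
enum relax_policy {
	RELAX_DRIVER_ONLY, /* vfio driver decides (my preference) */
	RELAX_VMM_ONLY,    /* user space decides via an ioctl() */
	RELAX_BOTH,        /* driver allows it, VMM opts in */
};

/* Returns true if stage 2 may use the relaxed (Normal NC) type. */
static bool relax_stage2_allowed(enum relax_policy policy,
				 bool driver_allows, bool vmm_requests)
{
	switch (policy) {
	case RELAX_DRIVER_ONLY:
		return driver_allows;
	case RELAX_VMM_ONLY:
		/* The host kernel trusts the VMM unconditionally here. */
		return vmm_requests;
	case RELAX_BOTH:
		return driver_allows && vmm_requests;
	}
	return false;
}
```

With RELAX_BOTH the VMM can only narrow what the driver already
permits, so the host kernel does not have to trust it any more than it
does today.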

--
Catalin