RE: [PATCH v2] kvm/arm64: fixed passthrough gpu into vm on arm64

From: Tian, Kevin
Date: Wed Apr 06 2022 - 08:26:04 EST

> From: Jason Gunthorpe <jgg@xxxxxxxx>
> Sent: Tuesday, April 5, 2022 1:02 AM
> On Mon, Apr 04, 2022 at 03:47:11PM +0100, Marc Zyngier wrote:
> > > I'm guessing it will turn into a SBSA like thing where the ARM ARM is
> > > kind of vague but a SOC has to implement Normal-NC in a certain way to
> > > be functional for the server market.
> >
> > The main issue is that this equivalence isn't architected, so people
> > can build whatever they want. SBSA means nothing to KVM (or Linux at
> > large), and there is currently no way to describe which devices are
> > safe to map as Normal-NC vs Device.
>
> And people have, we know of some ARM SOC's that don't work fully with
> NORMAL_NC for this usage. That is already a problem for baremetal
> Linux, let alone KVM..
>
> That is why I likened it to SBSA - if you want to build a server SOC
> that works with existing server software, you have to support
> NORMAL_NC in this way. Even if it isn't architected.
>
> The KVM challenge, at least, is to support a CPU with working
> NORMAL_NC to create a VM that emulates the same CPU with working
> NORMAL_NC.
>
> I didn't quite understand your other remarks though - is there a
> problem here? It seems like yes from the other thread you pointed at?
>
> I would think that KVM should mirror the process page table
> configuration into the KVM page table and make this into a userspace
> problem?
>
> That turns it into a VFIO problem to negotiate with userspace and set
> the proper pgprot. At least VFIO has a better chance than KVM to
> consult DT or something to learn about the specific device's
> properties.
>
> I don't know how VFIO/qemu/etc can make this all work automatically
> correctly 100% of the time. It seems to me it is the same problem as
> just basic baremetal "WC" is troubled on ARM in general today. Maybe
> some tables and a command line option in qemu is the best we can hope
> for.

I don't know those ARM details or how they differ from x86. Just FYI,
here is how it works on Intel platforms.

If no device is assigned, the KVM page table (EPT) PTE is always set to
'forced WB', which overrides the guest cache attributes.

If a device is assigned, the EPT PTE is set to:

1) UC for MMIO regions. The effective memory type is UC or WC
depending on guest attributes;

2) forced WB for memory pages if the device cannot do non-coherent
DMA. Guest cache attributes are overridden (same as with no device
assigned);

3) a type that makes the guest cache attributes effective for memory
pages if non-coherent DMA is possible.

All of the above logic is contained in vmx_get_mt_mask().
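
To make the three cases concrete, here is a rough user-space sketch of
that decision. It is not the kernel function: ept_mt_mask(), its
parameters and the macro names are made up for illustration. It assumes
the EPT memory type sits in PTE bits 5:3 with the "ignore PAT" bit at
bit 6, that host MMIO mappings only show up once a device is assigned,
and that "no device assigned" can be folded into "no non-coherent DMA".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 memory type encodings (shared by MTRR, PAT and EPT). */
enum { MT_UC = 0, MT_WC = 1, MT_WB = 6 };

#define EPT_MT_SHIFT  3            /* memory type field: PTE bits 5:3 */
#define EPT_IPAT_BIT  (1ull << 6)  /* ignore guest PAT, force EPT type */

/*
 * Hypothetical model of the policy described above.
 *   is_mmio         - page is a host MMIO region of an assigned device
 *   noncoherent_dma - an assigned device may do non-coherent DMA
 *                     (false also covers "no device assigned")
 *   guest_mt        - memory type the guest asked for (illustrative)
 */
static uint64_t ept_mt_mask(bool is_mmio, bool noncoherent_dma,
                            uint8_t guest_mt)
{
    assert(guest_mt <= 7);  /* the memory type field is 3 bits wide */

    /* Case 1: UC without IPAT, so the guest can still select WC. */
    if (is_mmio)
        return (uint64_t)MT_UC << EPT_MT_SHIFT;

    /* Case 2 (and no device): forced WB, guest attributes ignored. */
    if (!noncoherent_dma)
        return ((uint64_t)MT_WB << EPT_MT_SHIFT) | EPT_IPAT_BIT;

    /* Case 3: honour the guest's type, IPAT left clear. */
    return (uint64_t)guest_mt << EPT_MT_SHIFT;
}
```

The point of leaving IPAT clear in cases 1 and 3 is that the guest's
own attributes still take part in the effective memory type, while
case 2 sets it so the guest cannot move RAM away from WB.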