Re: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO
From: Jason Gunthorpe
Date: Thu Sep 30 2021 - 18:24:02 EST
On Thu, Sep 30, 2021 at 09:35:45AM +0000, Tian, Kevin wrote:
> > The Intel functional issue is that Intel blocks the cache maintenance
> > ops from the VM and the VM has no way to self-discover that the cache
> > maintenance ops don't work.
>
> the VM doesn't need to know whether the maintenance ops
> actually work.
Which is the whole problem.
Intel has a design where the device driver tells the device to issue
non-cacheable TLPs.
The driver is supposed to know whether it can issue the cache
maintenance instructions - if it can, then it may ask the device to
issue no-snoop TLPs.
For instance, the same PCI driver on non-x86 should never ask the
device to issue no-snoop TLPs, because it has no idea how to restore
cache coherence on, e.g., ARM.
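Roughly, and skipping error handling, a portable driver would have to
do something like the below. mydrv_setup_nosnoop() is a hypothetical
helper for illustration, not code from this series:

#include <linux/pci.h>

/* Hypothetical helper: only opt in to no-snoop TLPs when the CPU
 * side knows how to restore coherency afterwards. */
static void mydrv_setup_nosnoop(struct pci_dev *pdev)
{
	if (IS_ENABLED(CONFIG_X86))
		/* x86 can pair no-snoop DMA with clflush/wbinvd */
		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL,
					 PCI_EXP_DEVCTL_NOSNOOP_EN);
	else
		/* No portable way to regain coherency, eg on ARM */
		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
					   PCI_EXP_DEVCTL_NOSNOOP_EN);
}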
Do you see the issue? This configuration where the hypervisor silently
makes wbinvd a NOP breaks the x86 architecture because the guest has
no idea it can no longer use no-snoop features.
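To make that concrete, this is approximately what KVM does on x86
today, heavily simplified from arch/x86/kvm/x86.c:

/* Simplified: a guest WBINVD only reaches the real caches when
 * non-coherent DMA has been registered for the VM. Otherwise it
 * is dropped and the guest has no way to observe that. */
static int emulate_wbinvd_sketch(struct kvm_vcpu *vcpu)
{
	if (kvm_arch_has_noncoherent_dma(vcpu->kvm))
		wbinvd_on_all_cpus();	/* actually flush */
	/* else: silent NOP */
	return X86EMUL_CONTINUE;
}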
Using the IOMMU to forcibly prevent the device from issuing no-snoop
makes this whole issue of the broken wbinvd moot.
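That is essentially what VFIO does today: if the IOMMU reports
IOMMU_CAP_CACHE_COHERENCY it maps everything with IOMMU_CACHE, which
on Intel sets the SNP bit in the I/O page table and overrides the
no-snoop attribute in the TLP. Sketched below - map_coherent() is a
made-up helper, the capability check and flags are the real API:

#include <linux/iommu.h>
#include <linux/pci.h>

static int map_coherent(struct iommu_domain *domain, unsigned long iova,
			phys_addr_t paddr, size_t size)
{
	int prot = IOMMU_READ | IOMMU_WRITE;

	if (iommu_capable(&pci_bus_type, IOMMU_CAP_CACHE_COHERENCY))
		prot |= IOMMU_CACHE;	/* force-snoop: no-snoop TLPs ignored */

	return iommu_map(domain, iova, paddr, size, prot);
}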
It is important to be really clear on what this is about - this is not
some idealized nice iommu feature - it is working around a lot of
backwards-compatibility baggage that is probably completely unique to
x86.
> > Other arches don't seem to have this specific problem...
>
> I think the key is whether other archs allow the driver to decide DMA
> coherency and indirectly the underlying I/O page table format.
> If yes, then I don't see a reason why such a decision should not be
> given to userspace for the passthrough case.
The choice all comes down to whether the other arches have cache
maintenance instructions in the VM that *don't work*.
Jason