RE: [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in non-coherent domains

From: Tian, Kevin
Date: Wed May 22 2024 - 19:41:14 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Thursday, May 23, 2024 7:32 AM
>
> On Wed, May 22, 2024 at 11:26:21PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Wednesday, May 22, 2024 8:30 PM
> > >
> > > On Wed, May 22, 2024 at 06:24:14AM +0000, Tian, Kevin wrote:
> > > > I'm fine to do a special check in the attach path to enable the flush
> > > > only for Intel GPU.
> > >
> > > We effectively do this already by checking the domain
> > > capabilities. Only the Intel GPU will have a non-coherent domain.
> > >
> >
> > I'm confused. In earlier discussions you wanted to find a way to not
> > punish others due to the check of non-coherent domain, e.g. some
> > ARM SMMU cannot force snoop.
> >
> > Then you and Alex discussed the possibility of reducing pessimistic
> > flushes by virtualizing the PCI NOSNOOP bit.
> >
> > With that in mind I was thinking whether we explicitly enable this
> > flush only for Intel GPU instead of checking non-coherent domain
> > in the attach path, since it's the only device with such requirement.
>
> I am suggesting to do both checks:
> - If the iommu domain indicates it has force coherency then leave PCI
> no-snoop alone and no flush
> - If the PCI NOSNOOP bit is or can be 0 then no flush
> - Otherwise flush

How do we judge whether PCI NOSNOOP can be 0? Per the PCI spec it can
always be set to 0, but doing so breaks the requirement for Intel GPU.
If we explicitly exempt Intel GPU in the 2nd check, then what would be
the value of doing that generic check?
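
To make the question concrete, here is a minimal sketch of the three-step
check quoted above as I read it, with the Intel GPU exemption added to the
2nd check. Everything in it is hypothetical and for illustration only: the
struct, its fields and the function name are not existing kernel code, and
how nosnoop_can_be_zero would actually be determined is exactly the open
question.

```c
#include <stdbool.h>

/* Hypothetical per-device state; none of these names exist in the kernel. */
struct nosnoop_sketch {
	/* IOMMU domain can force coherency (enforce snoop) */
	bool domain_enforces_coherency;
	/* PCI NOSNOOP bit is 0, or can be forced to 0 (assumed input) */
	bool nosnoop_can_be_zero;
	/* Assumed quirk: device requires non-coherent DMA (e.g. Intel GPU) */
	bool requires_noncoherent_dma;
};

/* Returns true when CPU caches must be flushed on the DMA pages. */
static bool sketch_need_flush(const struct nosnoop_sketch *d)
{
	/* Check 1: forced coherency, leave PCI no-snoop alone, no flush */
	if (d->domain_enforces_coherency)
		return false;

	/* Check 2: NOSNOOP is or can be 0, unless the device is exempted */
	if (d->nosnoop_can_be_zero && !d->requires_noncoherent_dma)
		return false;

	/* Otherwise flush */
	return true;
}
```

With the exemption in place, a device flagged requires_noncoherent_dma
always falls through to the flush unless check 1 applies, so check 2 only
ever matters for devices that never needed the flush in the first place.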