Re: [PATCH 5/5] iommufd: Flush CPU caches on DMA pages in non-coherent domains

From: Jason Gunthorpe
Date: Tue May 21 2024 - 12:05:05 EST


On Mon, May 20, 2024 at 10:45:56AM +0800, Yan Zhao wrote:
> On Fri, May 17, 2024 at 02:04:18PM -0300, Jason Gunthorpe wrote:
> > On Thu, May 16, 2024 at 10:32:43AM +0800, Yan Zhao wrote:
> > > On Wed, May 15, 2024 at 05:43:04PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, May 15, 2024 at 03:06:36PM +0800, Yan Zhao wrote:
> > > >
> > > > > > So it has to be calculated on closer to a page by page basis (really a
> > > > > > span by span basis) if flushing of that span is needed based on where
> > > > > > the pages came from. Only pages that came from a hwpt that is
> > > > > > non-coherent can skip the flushing.
> > > > > Is area by area basis also good?
> > > > > Isn't an area either not mapped to any domain or mapped into all domains?
> > > >
> > > > Yes, this is what the span iterator turns into in the background, it
> > > > goes area by area to cover things.
> > > >
> > > > > But, yes, considering the limited number of non-coherent domains, it appears
> > > > > more robust and clean to always flush for non-coherent domain in
> > > > > iopt_area_fill_domain().
> > > > > It eliminates the need to decide whether to retain the area flag during a split.
> > > >
> > > > And flush for pin user pages, so you basically always flush because
> > > > you can't tell where the pages came from.
> > > As a summary, do you think it's good to flush in below way?
> > >
> > > 1. in iopt_area_fill_domains(), flush before mapping a page into domains when
> > > iopt->noncoherent_domain_cnt > 0, no matter where the page is from.
> > > Record cache_flush_required in pages for unpin.
> > > 2. in iopt_area_fill_domain(), pass in hwpt to check domain non-coherency.
> > > flush before mapping a page into a non-coherent domain, no matter where the
> > > page is from.
> > > Record cache_flush_required in pages for unpin.
> > > 3. in batch_unpin(), flush if pages->cache_flush_required before
> > > unpin_user_pages.
> >
> > It does not quite sound right, there should be no tracking in the
> > pages of this stuff.
> What's the downside of having tracking in the pages?

Well, a counter doesn't make sense. You could have a single sticky bit
that indicates that all PFNs are coherency dirty and overflush them on
every map and unmap operation.

This is certainly the simplest option, but gives the maximal flushes.

If you want to minimize flushes then you can't store flush
minimization information in the pages because it isn't global to the
pages and will not be accurate enough.

> > If pfn_reader_fill_span() does batch_from_domain() and
> > the source domain's storage_domain is non-coherent then you can skip
> > the flush. This is not pedantically perfect in skipping all flushes, but
> > in practice it is probably good enough.

> We don't know whether the source storage_domain is non-coherent since
> area->storage_domain is of "struct iommu_domain".

> Do you want to add a flag in "area", e.g. area->storage_domain_is_noncoherent,
> and set this flag along side setting storage_domain?

Sure, that could work.

> > __iopt_area_unfill_domain() (and children) must flush after
> > iopt_area_unmap_domain_range() if the area's domain is
> > non-coherent. This is also not perfect, but probably good enough.
> Do you mean flush after each iopt_area_unmap_domain_range() if the domain is
> non-coherent?
> The problem is that iopt_area_unmap_domain_range() knows only IOVA, the
> IOVA->PFN relationship is not available without iommu_iova_to_phys() and
> iommu_domain contains no coherency info.

Yes, you'd have to read back the PFNs on this path which it doesn't do
right now.. Given this pain it would be simpler to have one bit in the
pages that marks it permanently non-coherent and all pfns will be
flushed before put_page is called.

The trouble with a counter is that the count going to zero doesn't
really mean we flushed the PFN if it is being held someplace else.

Jason