Re: [PATCH -next] memregion: Add arch_flush_memregion() interface

From: Andrew Morton
Date: Thu Sep 08 2022 - 19:00:45 EST


On Thu, 8 Sep 2022 15:51:50 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

> Jonathan Cameron wrote:
> > On Wed, 7 Sep 2022 18:07:31 -0700
> > Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > > Andrew Morton wrote:
> > > > I really dislike the term "flush". Sometimes it means writeback,
> > > > sometimes it means invalidate. Perhaps at other times it means
> > > > both.
> > > >
> > > > Can we please be very clear in comments and changelogs about exactly
> > > > what this "flush" does. With bonus points for being more specific in the
> > > > function naming?
> > > >
> > >
> > > That's a good point, "flush" has been cargo-culted along in Linux's
> > > cache management APIs to mean write-back-and-invalidate. In this case I
> > > think this API is purely about invalidate. It just so happens that x86
> > > has not historically had a global invalidate instruction readily
> > > available which leads to the overuse of wbinvd.
> > >
> > > It would be nice to make clear that this API is purely about
> > > invalidating any data cached for a physical address impacted by address
> > > space management event (secure erase / new region provision). Write-back
> > > is an unnecessary side-effect.
> > >
> > > So how about:
> > >
> > > s/arch_flush_memregion/cpu_cache_invalidate_memregion/?
> >
> > Want to indicate it 'might' write back perhaps?
> > So could be invalidate or clean and invalidate (using arm ARM terms just to add
> > to the confusion ;)
> >
> > Feels like there will be potential race conditions where that matters as we might
> > force stale data to be written back.
> >
> > Perhaps a comment is enough for that. Anyone have the "famous last words" feeling?
>
> Is "invalidate" not clear that write-back is optional? Maybe not.

Yes, I'd say that "invalidate" means "dirty stuff may of may not have
been written back". Ditto for invalidate_inode_pages2().

> Also, I realized that we tried to include the address range to allow for
> the possibility of flushing by virtual address range, but that
> overcomplicates the use. I.e. if someone issue secure erase and the
> region association is not established does that mean that mean that the
> cache invalidation is not needed? It could be the case that someone
> disables a device, does the secure erase, and then reattaches to the
> same region. The cache invalidation is needed, but at the time of the
> secure erase the HPA was unknown.
>
> All this to say that I feel the bikeshedding will need to continue until
> morale improves.
>
> I notice that the DMA API uses 'sync' to indicate, "make this memory
> consistent/coherent for the CPU or the device", so how about an API like
>
> memregion_sync_for_cpu(int res_desc)
>
> ...where the @res_desc would be IORES_DESC_CXL for all CXL and
> IORES_DESC_PERSISTENT_MEMORY for the current nvdimm use case.

"sync" is another of my pet peeves ;) In filesystem land, at least.
Does it mean "start writeback and return" or does it mean "start
writeback, wait for its completion then return".