Re: [PATCH -next] memregion: Add arch_flush_memregion() interface

From: Davidlohr Bueso
Date: Tue Sep 13 2022 - 13:27:56 EST


On Fri, 09 Sep 2022, Jonathan Cameron wrote:

On Thu, 8 Sep 2022 16:22:26 -0700
Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

Andrew Morton wrote:
> On Thu, 8 Sep 2022 15:51:50 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> > Jonathan Cameron wrote:
> > > On Wed, 7 Sep 2022 18:07:31 -0700
> > > Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > >
> > > > Andrew Morton wrote:
> > > > > I really dislike the term "flush". Sometimes it means writeback,
> > > > > sometimes it means invalidate. Perhaps at other times it means
> > > > > both.
> > > > >
> > > > > Can we please be very clear in comments and changelogs about exactly
> > > > > what this "flush" does. With bonus points for being more specific in the
> > > > > function naming?
> > > > >
> > > >
> > > > That's a good point, "flush" has been cargo-culted along in Linux's
> > > > cache management APIs to mean write-back-and-invalidate. In this case I
> > > > think this API is purely about invalidate. It just so happens that x86
> > > > has not historically had a global invalidate instruction readily
> > > > available which leads to the overuse of wbinvd.
> > > >
> > > > It would be nice to make clear that this API is purely about
> > > > invalidating any data cached for a physical address impacted by an
> > > > address space management event (secure erase / new region provision).
> > > > Write-back is an unnecessary side-effect.
> > > >
> > > > So how about:
> > > >
> > > > s/arch_flush_memregion/cpu_cache_invalidate_memregion/?
> > >
> > > Want to indicate it 'might' write back perhaps?
> > > So could be invalidate or clean and invalidate (using arm ARM terms just to add
> > > to the confusion ;)
> > >
> > > Feels like there will be potential race conditions where that matters as we might
> > > force stale data to be written back.
> > >
> > > Perhaps a comment is enough for that. Anyone have the "famous last words" feeling?
> >
> > Is "invalidate" not clear that write-back is optional? Maybe not.
>
> Yes, I'd say that "invalidate" means "dirty stuff may or may not have
> been written back". Ditto for invalidate_inode_pages2().
>
> > Also, I realized that we tried to include the address range to allow for
> > the possibility of flushing by virtual address range, but that
> > overcomplicates the use. I.e. if someone issues a secure erase and the
> > region association is not established, does that mean that the
> > cache invalidation is not needed? It could be the case that someone
> > disables a device, does the secure erase, and then reattaches to the
> > same region. The cache invalidation is needed, but at the time of the
> > secure erase the HPA was unknown.
> >
> > All this to say that I feel the bikeshedding will need to continue until
> > morale improves.
> >
> > I notice that the DMA API uses 'sync' to indicate, "make this memory
> > consistent/coherent for the CPU or the device", so how about an API like
> >
> > memregion_sync_for_cpu(int res_desc)
> >
> > ...where the @res_desc would be IORES_DESC_CXL for all CXL and
> > IORES_DESC_PERSISTENT_MEMORY for the current nvdimm use case.
>
> "sync" is another of my pet peeves ;) In filesystem land, at least.
> Does it mean "start writeback and return" or does it mean "start
> writeback, wait for its completion then return".

Ok, no "sync" :).

/**
* cpu_cache_invalidate_memregion - drop any CPU cached data for
* memregions described by @res_desc
* @res_desc: one of the IORES_DESC_* types
*
* Perform cache maintenance after a memory event / operation that
* changes the contents of physical memory in a cache-incoherent manner.
* For example, memory-device secure erase, or provisioning new CXL
* regions. This routine may or may not write back any dirty contents
* while performing the invalidation.
*
* Returns 0 on success or negative error code on a failure to perform
* the cache maintenance.
*/
int cpu_cache_invalidate_memregion(int res_desc)

lgtm

Likewise, and I don't see anyone else objecting so I'll go ahead and send
a new iteration.
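
For anyone following along, here is a rough userspace sketch of the calling
convention discussed above. It is not the kernel implementation: the
MOCK_IORES_DESC_* values, the mock_arch_can_invalidate flag, and
mock_secure_erase() are hypothetical stand-ins, and the -EINVAL/-EOPNOTSUPP
choices are assumptions. Only the name, the @res_desc argument, and the
"0 on success or negative error code" convention come from the proposal.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Mock stand-ins for the kernel's IORES_DESC_* types; values are illustrative. */
enum mock_res_desc {
	MOCK_IORES_DESC_PERSISTENT_MEMORY = 1,
	MOCK_IORES_DESC_CXL = 2,
};

/* Hypothetical knob: can this (mock) architecture invalidate CPU caches? */
static bool mock_arch_can_invalidate = true;

/* Counts invalidations that were actually performed, for inspection. */
static int mock_invalidate_count;

/*
 * Drop any CPU cached data for memregions described by @res_desc.
 * Dirty lines may or may not be written back along the way; callers
 * must not rely on either behavior.
 * Returns 0 on success or a negative error code.
 */
int cpu_cache_invalidate_memregion(int res_desc)
{
	if (res_desc != MOCK_IORES_DESC_PERSISTENT_MEMORY &&
	    res_desc != MOCK_IORES_DESC_CXL)
		return -EINVAL;

	if (!mock_arch_can_invalidate)
		return -EOPNOTSUPP;

	/* A real backend would issue the arch-specific maintenance here. */
	mock_invalidate_count++;
	return 0;
}

/*
 * Hypothetical caller: the device contents change behind the CPU's back
 * (e.g. secure erase), so stale cached data must be dropped afterwards.
 */
int mock_secure_erase(int res_desc)
{
	int rc;

	/* ... the device-side erase would happen here ... */

	rc = cpu_cache_invalidate_memregion(res_desc);
	if (rc)
		fprintf(stderr, "cache invalidation failed: %d\n", rc);
	return rc;
}
```

Note the caller passes only the resource type, not an address range, so the
invalidation still works when the HPA is unknown at secure erase time.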

Thanks,
Davidlohr