Re: [PATCH v1 1/2] KVM: arm64: determine memory type from VMA

From: Jason Gunthorpe
Date: Tue Oct 10 2023 - 15:39:55 EST


On Tue, Oct 10, 2023 at 06:19:14PM +0100, Catalin Marinas wrote:

> This has been fixed in newer architecture versions but we haven't
> added Linux support for it yet (and there's no hardware available
> either). AFAIK, there's no MTE support for CXL-attached memory at
> the moment in the current interconnects, so better not pretend it's
> general purpose memory that supports all the features.

I don't know much about MTE, but the use case imagined for CXL memory
allows the MM to exchange any system page with a CXL page. So there
cannot be a behavioral difference.

Can userspace continue to use tagged pointers even if the mm has moved
the pages to CXL that doesn't support it?

The main purpose for this is to tier VM memory, so having CXL
behaviorally different in a VM seems fatal to me.
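To make the concern concrete, here is a toy model (not kernel code; all
names are made up for illustration) of why tiering a tagged page out to
memory without tag storage is lossy: the data migrates, the MTE tags do
not, and the guest's subsequent tag checks would fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model: an MTE-capable page carries one allocation tag per 16-byte
 * granule alongside its data; a non-MTE page holds data only. */
struct model_page {
	unsigned char data[64];
	unsigned char tags[4];	/* one tag per 16-byte granule */
	bool has_tag_storage;	/* can the backing memory hold tags? */
};

/* "Migration" copies the data, and the tags only when the destination
 * can hold them -- mirroring what would happen if the mm tiered a page
 * out to CXL memory that lacks MTE tag storage. */
static void model_migrate(struct model_page *dst, const struct model_page *src)
{
	memcpy(dst->data, src->data, sizeof dst->data);
	if (dst->has_tag_storage)
		memcpy(dst->tags, src->tags, sizeof dst->tags);
	else
		memset(dst->tags, 0, sizeof dst->tags);	/* tags silently lost */
}
```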

Linux drivers need a way to understand this; we can't have a CXL
memory pool driver or GPU driver calling memremap_pages() and getting
a somewhat broken system because of MTE incompatibilities. So maybe
ARM really should block memremap_pages() in case of MTE until
everything is resolved?
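A sketch of what such a gate could look like, using stand-in names
(`memremap_pages_check()`, the stub flags, and the tag-storage property
are all hypothetical; the real kernel would consult
`system_supports_mte()` and arch knowledge of the hotplugged range):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for system_supports_mte(). */
static bool system_supports_mte_stub = true;

struct dev_pagemap_stub {
	bool range_has_tag_storage;	/* assumed property of the range */
};

/* Sketch: refuse struct-page hotplug when it would create cacheable
 * pages that cannot honor MTE on an MTE-capable system, instead of
 * creating second-class cacheable struct pages. */
static int memremap_pages_check(const struct dev_pagemap_stub *pgmap)
{
	if (system_supports_mte_stub && !pgmap->range_has_tag_storage)
		return -EOPNOTSUPP;
	return 0;
}
```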

From the mm perspective we can't have two kinds of cacheable struct
pages running around that are functionally different.

> Other than preventing malicious guest behaviour, it depends what the VM
> needs cacheable access for: some GPU memory that's only used for sharing
> data and we don't need all features or general purpose memory that a VM
> can use to run applications from etc. The former may not need all the
> features (e.g. can skip exclusives) but the latter does.

Like CXL memory pooling, GPU memory is used interchangeably with every
kind of DDR memory in every context. It must be completely transparent
and interchangeable via the mm's migration machinery.

> > > I've seen something similar in the past with
> > > LSE atomics (or was it exclusives?) not being propagated. These don't
> > > make the memory safe for a guest to use as general purpose RAM.
> >
> > At least from a mm perspective, I think it is important that cachable
> > struct pages are all the same and all interchangable. If the arch
> > cannot provide this it should not allow the pgmap/memremap to succeed
> > at all. Otherwise drivers using these new APIs are never going to work
> > fully right..
>
> Yes, for struct page backed memory, the current assumption is that all
> are the same, support all CPU features. It's the PFN-based memory where
> we don't have such guarantees.

I see this got a bit confused: I am talking about memremap_pages()
(i.e. include/linux/memremap.h), not memremap() (include/linux/io.h),
which is for getting non-struct-page memory. It is confusing :|

memremap_pages() is one of the entry points of the struct page hotplug
machinery. Things like CXL drivers assume they can hot-plug new
memory through these APIs and get new cacheable struct pages that are
functionally identical to boot-time cacheable struct pages.

> We have an additional flag, VM_MTE_ALLOWED, only set for mappings backed
> by struct page. We could probe that in KVM and either fall back to
> non-cacheable or allow cacheable if MTE is disable at the vCPU level.

I'm not sure what this does; it is only set by shmem_mmap()? That is
much stricter than "mappings backed by struct page".
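Still, the fallback you describe could be sketched like this (purely
illustrative; the enum and function names are made up, and the real
decision would live in KVM's stage-2 fault handling):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stage-2 memory attribute choice. */
enum s2_attr { S2_CACHEABLE, S2_NONCACHEABLE };

static enum s2_attr stage2_attr_for_vma(bool vma_mte_allowed,
					bool vcpu_mte_enabled)
{
	/* MTE disabled at the vCPU level: cacheable is safe regardless
	 * of what the backing VMA supports. */
	if (!vcpu_mte_enabled)
		return S2_CACHEABLE;
	/* The guest may issue tagged accesses: only allow cacheable
	 * when the backing VMA is marked as MTE-capable. */
	return vma_mte_allowed ? S2_CACHEABLE : S2_NONCACHEABLE;
}
```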

Still, I'm not sure how to proceed here - we have veered into MTE
territory that I don't know we have experience with yet.

Thanks,
Jason