Re: [PATCH] tdx, memory hotplug: Check whole hot-adding memory range for TDX

From: Dan Williams
Date: Thu Oct 10 2024 - 14:55:20 EST


James Morse wrote:
[..]
> > Yeah, it should be a good way to let the kernel know whether CXL
> > supports memory tagging or not.
>
> On its own I don't think it's enough - there would need to be some kind of capability in
> both the CXL root-port and the device to say that MTE tags are sent in that metadata
> field. If both support it, then the device memory supports MTE.
>
> (I'll poke the standards people to see if this is something they already have in the
> works...)

If it helps, the question I would ask is "will the ACPI CFMWS (CXL Fixed
Memory Window Structure) grow a new 'Window Restrictions' bit
indicating the presence of EMD support, or will it be left to an
ARM-specific enumeration outside of CFMWS?".

> >>>> However, why would it be ok to access CXL memory without MTE via devdax,
> >>>> but not as online page allocator memory?
>
> >>> CXL memory can be onlined as system ram as long as MTE is not enabled.
> >>> It can only be used as a devdax device if MTE is enabled.
>
> This makes sense to me.
>
> We can print a warning that 'arm64.nomte' should be passed on the command line if the CXL
> memory is more important than MTE and the hardware can't support both.
>
>
> >> Do you mean the kernel only manages MTE for kernel pages, but with user
> >> mapped memory the application will need to implicitly know that
> >> memory-tagging is not available?
> >
> > I think the current assumption is that all buddy memory (can be used
> > by userspace) should be taggable. And memory tagging is only supported
> > for anonymous mapping and tmpfs. I'm adding hugetlbfs support. But any
> > memory backed by the real backing store doesn't have memory tagging
> > support.
>
> Hopefully there are no assumptions here! -
> Documentation/arch/arm64/memory-tagging-extension.rst says anonymous mappings can have
> PROT_MTE set.
>
> The arch code requires all memory to support MTE if the CPUs support it.
>
>
> >> I worry about applications that might not know that their heap is coming
> >> from a userspace memory allocator backed by device-dax rather than the
> >> kernel.
> >
> > IIUC, memory mapping from device-dax is a file mapping, right? If so,
> > it is safe. If it is not, I think it is easy to handle. We can just
> > reject any VM_MTE mapping from DAX.
>
> That should already be the case. (we should check!)
>
> Because devdax is already a file-mapping, user-space can't expect MTE to work.
> While some library may not know the memory came from devdax - whoever wrote the
> malloc()/free() implementation will have known they were using devdax - this is where the
> decisions to use MTE and what tag to use are made.
>
> I don't think this adds a new broken case.

Yeah, makes sense.
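
For the "we should check!" part, here is a minimal userspace model of the
kind of gate being discussed: a mapping may only carry the MTE flag if it
was explicitly marked as allowed to, and a devdax file mapping never gets
that mark. It is loosely modeled on my reading of the arm64
arch_validate_flags() logic. The MODEL_* names and values and the helper
functions are stand-ins invented for illustration, not kernel
definitions.

```c
#include <stdbool.h>

/* Stand-in flag bits; the values are illustrative, not the kernel's. */
#define MODEL_VM_MTE          (1UL << 0) /* mapping asked for tagged memory */
#define MODEL_VM_MTE_ALLOWED  (1UL << 1) /* backing memory may be tagged    */

/* Hypothetical: which kinds of mapping get MODEL_VM_MTE_ALLOWED. */
enum model_backing { MODEL_ANON, MODEL_TMPFS, MODEL_DEVDAX };

static unsigned long model_calc_flags(enum model_backing b, bool prot_mte)
{
	unsigned long flags = prot_mte ? MODEL_VM_MTE : 0;

	/* Only anonymous and tmpfs mappings are treated as taggable here. */
	if (b == MODEL_ANON || b == MODEL_TMPFS)
		flags |= MODEL_VM_MTE_ALLOWED;
	return flags;
}

/* A mapping is valid unless it wants MTE without being allowed to. */
static bool model_validate_flags(unsigned long flags)
{
	return !(flags & MODEL_VM_MTE) || (flags & MODEL_VM_MTE_ALLOWED);
}
```

With this model, a PROT_MTE request against devdax fails validation
while the same request on anonymous memory passes, and devdax without
PROT_MTE is unaffected.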

> >>>> If the goal is to simply deny any and all non-MTE supported CXL regions
> >>>> from attaching then that could probably be handled as a modification to
> >>>> the "cxl_acpi" driver to deny region creation unless it supports
> >>>> everything the CPU expects from "memory".
> >>>
> >>> I'm not quite familiar with the details of the CXL driver. What do you
> >>> mean by "deny region creation"? As long as the CXL memory still can be
> >>> used as devdax device, it should be fine.
> >>
> >> Meaning that the CXL subsystem knows how to, for a given address range, figure
> >> out the members and geometry of the CXL devices that contribute to that
> >> range (CXL region). It would be straightforward to add EMD to that
> >> enumeration and flag the CXL region as not online-capable if the CPU has
> >> MTE enabled but no EMD capability.
> >
> > It sounds like a good way to me.
>
> From your earlier description, EMD may not be enough - and this would depend on the
> root-port (or at least the host side decoders) to support this too. I'll poke the spec
> people...

About the best CXL could do is indicate that the CXL window supports
EMD, but that is not sufficient for determining the arch capability for
MTE. Something tells me this might end up being an ARM-specific (ACPI
or otherwise) enumeration that flags which CXL windows, if any, support
MTE, regardless of EMD support.
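
To make the policy concrete, a rough userspace sketch of the decision,
not kernel code. Every name below (struct cxl_window_model, its fields,
region_mode_for()) is hypothetical; no such CFMWS bit or ARM enumeration
exists today as far as this thread has established.

```c
#include <stdbool.h>

/* Hypothetical per-window description; neither field exists today. */
struct cxl_window_model {
	bool emd_capable; /* window advertises EMD (hypothetical bit)      */
	bool arch_mte_ok; /* arch-specific enumeration: MTE tags work here */
};

enum region_mode {
	REGION_ONLINE_OK,   /* may be onlined as system RAM */
	REGION_DEVDAX_ONLY, /* usable via devdax only       */
};

static enum region_mode region_mode_for(bool cpu_mte_enabled,
					const struct cxl_window_model *w)
{
	/* Without MTE on the CPU side there is nothing to restrict. */
	if (!cpu_mte_enabled)
		return REGION_ONLINE_OK;

	/*
	 * EMD alone is deliberately not consulted: it says the metadata
	 * field exists, not that MTE tags are carried in it end to end.
	 */
	return w->arch_mte_ok ? REGION_ONLINE_OK : REGION_DEVDAX_ONLY;
}
```

The point of keeping emd_capable in the struct but out of the decision
is that an EMD-capable window without the arch-level indication still
ends up devdax-only when MTE is enabled.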