Re: [LSF/MM] CXL Boot to Bash - Section 0a: CFMWS and NUMA Flexibility

From: Gregory Price
Date: Thu Mar 13 2025 - 14:19:16 EST


On Thu, Mar 13, 2025 at 05:20:04PM +0000, Jonathan Cameron wrote:
> Gregory Price <gourry@xxxxxxxxxx> wrote:
>
> > -------------------------------
> > One 2GB Device, Multiple CFMWS.
> > -------------------------------
> > Let's imagine we have one 2GB device attached to a host bridge.
> >
> > In this example, the device hosts 2GB of persistent memory - but we
> > might want the flexibility to map capacity as volatile or persistent.
>
> Fairly sure we block persistent in a volatile CFMWS in the kernel.
> Any bios actually does this?
>
> You might have a variable partition device but I thought in kernel at
> least we decided that no one was building that crazy?
>

This was an example I pulled from Dan's notes elsewhere (I think).

I was unaware that we blocked mapping persistent as volatile. I was
working off the assumption that it could be flexibly mapped, similar
to... er... older, non-CXL hardware... cough.

> Maybe a QoS split is a better example to motivate one range, two places?
>

That probably makes sense?

> > -------------------------------------------------------------
> > Two Devices On One Host Bridge - With and Without Interleave.
> > -------------------------------------------------------------
> > What if we wanted some capacity on each endpoint hosted on its own NUMA
> > node, and wanted to interleave a portion of each device capacity?
>
> If anyone hits the lock on commit (i.e. annoying BIOS) the ordering
> checks on HPA kick in here and restrict flexibility a lot
> (assuming I understand them correctly that is)
>
> This is a good illustration of why we should at some point revisit
> multiple NUMA nodes per CFMWS. We have to burn SPA space just
> to get nodes. From a spec point of view all that is needed here
> is a single CFMWS.
>
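
To put rough numbers on the "burn SPA space just to get nodes" point,
here is a minimal sketch. It assumes, as I understand the current
behavior, one NUMA node per CFMWS, and it assumes the BIOS sizes each
window for the most capacity that window could ever hold. The device
sizes, window names, and helper functions are all hypothetical and not
taken from any real platform or kernel interface.

```python
GiB = 1 << 30

# Hypothetical capacities: two endpoints behind one host bridge.
DEVICE_BYTES = {"dev0": 2 * GiB, "dev1": 2 * GiB}

# Hypothetical windows a BIOS might publish so the split between
# per-device and interleaved capacity can be chosen later.
# Assumption: each window is sized for the worst case it may hold.
WINDOWS = [
    ("dev0 only", ["dev0"]),
    ("dev1 only", ["dev1"]),
    ("dev0+dev1 interleaved", ["dev0", "dev1"]),
]


def window_size(devices):
    """Address space a window must reserve to cover its devices fully."""
    return sum(DEVICE_BYTES[d] for d in devices)


def layout(windows, first_node=1):
    """Return (name, node, size) tuples: one NUMA node per window."""
    return [
        (name, first_node + i, window_size(devs))
        for i, (name, devs) in enumerate(windows)
    ]


def address_space_reserved(windows):
    """Total address space reserved across all windows."""
    return sum(window_size(devs) for _, devs in windows)


def real_capacity():
    """Capacity the devices actually provide."""
    return sum(DEVICE_BYTES.values())


if __name__ == "__main__":
    for name, node, size in layout(WINDOWS):
        print(f"node {node}: {name:24s} {size // GiB} GiB window")
    print(f"reserved: {address_space_reserved(WINDOWS) // GiB} GiB, "
          f"backed by: {real_capacity() // GiB} GiB of device memory")
```

Under these assumptions, three windows reserve twice the address space
the devices can back, purely so that three nodes exist. A single window
the size of the real capacity would cover the same memory if nodes
were not tied one-to-one to windows.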

Along with the above note, and as mentioned on Discord, I think this
whole section naturally evolves into a library of "Sane configurations"
and "We promise nothing for `reasons`" configurations.

Maybe that turns into a kernel doc section that requires updating if
a platform disagrees / comes up with new sane configurations. This is
certainly the most difficult area to lock down because we have no idea
who is going to `innovate` and how.

~Gregory