Re: [Lsf-pc] [LSF/MM] CXL Boot to Bash - Section 0: ACPI and Linux Resources
From: Dan Williams
Date: Mon Mar 31 2025 - 19:50:23 EST
Gregory Price wrote:
> On Thu, Mar 27, 2025 at 09:21:55AM -0400, Dan Williams wrote:
> > Gregory Price wrote:
> > > On Thu, Mar 27, 2025 at 05:34:54PM +0800, Yuquan Wang wrote:
> > > >
> > > > In the future, srat.c would add one separate NUMA node for each
> > > > Generic Port in SRAT.
> > > >
> > > > System firmware should know the performance characteristics between
> > > > CPU/GI to the GP, and the static HMAT should include this coordinate.
> > > >
> > > > Is my understanding right?
> > > >
> > > >
> > >
> > > HMAT is static configuration data. A GI/GP might not have its
> > > performance data known until the device is added.
> >
> > The GP data is static and expected to be valid for all host bridges in
> > advance of any devices arriving.
> >
>
> Sorry, just shuffling words here for clarity. Making sure I understand:
>
> The GP data is static and enables Linux to do things like reserve numa
> nodes for any devices that might arrive in the future (i.e. create
> static objects that cannot be created post-__init).

Small nuance: the CFMWS is what Linux uses to reserve numa nodes; the GP
data is there to dynamically craft the equivalent of HMAT data for those
nodes when the device shows up.
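To make "dynamically craft" a bit more concrete, here is a rough sketch,
not kernel code. It assumes the usual rule for chaining hops on a path:
latencies add up, and bandwidth is capped by the slowest hop. The struct
and function names (perf_coord, combine_coords) are made up for
illustration, as are the units.

```c
/* Hypothetical stand-in for one hop's performance data; not a kernel type. */
struct perf_coord {
	unsigned int read_latency;    /* illustrative unit, e.g. nanoseconds */
	unsigned int write_latency;
	unsigned int read_bandwidth;  /* illustrative unit, e.g. MB/s */
	unsigned int write_bandwidth;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Combine two hops of a path (assumed rule for this sketch):
 * latencies accumulate, bandwidth is limited by the slower hop.
 */
static struct perf_coord combine_coords(struct perf_coord a,
					struct perf_coord b)
{
	struct perf_coord out = {
		.read_latency    = a.read_latency + b.read_latency,
		.write_latency   = a.write_latency + b.write_latency,
		.read_bandwidth  = min_uint(a.read_bandwidth, b.read_bandwidth),
		.write_bandwidth = min_uint(a.write_bandwidth, b.write_bandwidth),
	};

	return out;
}
```

In this picture the first argument would be the static CPU-to-GP data
from firmware, and the second would be whatever the device reports about
itself (e.g. via CDAT) once it arrives.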
Recall that the CFMWS enumerates a QoS class for each CXL window. That
QoS class is decided by some (waves hands) coordination between host
platform and device vendors. So there are some decisions, opaque to the
OS, about which devices should be mapped into which window.

See "9.17.3.1 _DSM Function for Retrieving QTG ID" in the CXL
specification for that opaque process.
Linux today just reports whether a device has any memory capacity that
matches any free-capacity-window QoS class, but does not enforce that
they must be compatible. This follows the assumption that it is better
to make capacity available than to perfectly match performance
characteristics.
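As a rough sketch of that "report, don't enforce" policy (again not
kernel code; window_sketch, qos_class_matches_any() and
capacity_available() are hypothetical names invented for this example):
the QoS check only produces an answer that can be reported, while the
capacity check never looks at the QoS class at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a CXL window: its QoS class and free capacity. */
struct window_sketch {
	int qos_class;
	unsigned long long free_bytes;
};

/* Report only: does any window with free capacity share the device's class? */
static bool qos_class_matches_any(const struct window_sketch *w, size_t n,
				  int dev_qos_class)
{
	for (size_t i = 0; i < n; i++)
		if (w[i].free_bytes > 0 && w[i].qos_class == dev_qos_class)
			return true;
	return false;
}

/* Capacity check deliberately ignores the QoS class: no enforcement. */
static bool capacity_available(const struct window_sketch *w, size_t n,
			       unsigned long long size)
{
	for (size_t i = 0; i < n; i++)
		if (w[i].free_bytes >= size)
			return true;
	return false;
}
```

A device whose class matches nothing would still pass
capacity_available() here as long as some window has room, which is the
"capacity over perfect match" trade-off described above.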
> If there's no device, there should not be any HMAT data.

...beyond GP data.
> If / when a device arrives, it's up to the OS to acquire that
> information from the device (e.g. CDAT). At this point the ACPI
> tables are not (shouldn't be) involved - it's all OS/device
> interactions.
>
> I should note that I don't have a full grasp of the GP ACPI stuff yet,
> so doing my best to grok it as I go here.

Again, no worries; this documentation is pulling all of this together
into one story.