Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for memoryless nodes

From: Gregory Price
Date: Tue Mar 18 2025 - 11:13:49 EST


On Tue, Mar 18, 2025 at 08:02:46PM +0900, Honggyu Kim wrote:
>
>
> On 3/18/2025 5:02 PM, Yunjeong Mun wrote:
>
> Some simple corrections here. host-bridge{0-3} above aren't detected from CEDT.
> The corrected structure is as follows.
>
> rootport/
> ├── socket0
> │   ├── cross-host-bridge0 -> SRAT && CEDT (interleave on) --> NODE 2
> │   │   ├── host-bridge0
> │   │   │   ├── cxl0 -> CEDT    node 4
> │   │   │   └── cxl1 -> CEDT    node 5
> │   │   └── host-bridge1
> │   │       ├── cxl2 -> CEDT    node 6
> │   │       └── cxl3 -> CEDT    node 7
> │   └── dram0 -> SRAT ---------------------------------------> NODE 0
> └── socket1
>     ├── cross-host-bridge1 -> SRAT && CEDT (interleave on)---> NODE 3
>     │   ├── host-bridge2
>     │   │   ├── cxl4 -> CEDT    node 8
>     │   │   └── cxl5 -> CEDT    node 9
>     │   └── host-bridge3
>     │       ├── cxl6 -> CEDT    node 10
>     │       └── cxl7 -> CEDT    node 11
>     └── dram1 -> SRAT ---------------------------------------> NODE 1
>

This is correct and expected.

All of these nodes are "possible" depending on how the user decides to
program the CXL decoders and expose memory to the page allocator.

In /sys/bus/cxl/devices/ you should see something like:

decoder0.0 decoder0.1 decoder0.2 decoder0.3
decoder0.4 decoder0.5 decoder0.6 decoder0.7
decoder0.8 decoder0.9

These are the root decoders, and each one should map directly to a CEDT
CFMWS entry.

Two of them should have interleave settings.
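
A quick way to check which root decoders carry interleave settings is to
read their sysfs attributes. The following is only a sketch: it assumes the
root decoders are the decoder0.N entries (as in the listing above) and that
each one exposes an interleave_ways attribute; the function names are made
up for illustration.

```python
from pathlib import Path


def root_decoder_interleave(devices_dir="/sys/bus/cxl/devices"):
    """Return {decoder name: interleave_ways} for decoder0.N entries.

    Assumption: root decoders are named decoder0.N and expose an
    'interleave_ways' attribute. Entries without a readable attribute
    are skipped rather than treated as errors.
    """
    entries = [
        p for p in Path(devices_dir).glob("decoder0.*")
        if p.name.rsplit(".", 1)[1].isdigit()
    ]
    # Sort numerically so decoder0.10 comes after decoder0.9.
    entries.sort(key=lambda p: int(p.name.rsplit(".", 1)[1]))

    result = {}
    for entry in entries:
        try:
            text = (entry / "interleave_ways").read_text()
            result[entry.name] = int(text.strip())
        except (OSError, ValueError):
            continue
    return result


def interleaved_root_decoders(devices_dir="/sys/bus/cxl/devices"):
    """Return the names of root decoders with interleave_ways > 1."""
    ways = root_decoder_interleave(devices_dir)
    return [name for name, n in ways.items() if n > 1]
```

On the platform described here, interleaved_root_decoders() would be
expected to return two names, one per cross-host-bridge window.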

If you instead programmed the endpoint and host-bridge decoders with the
matching non-interleave address ranges from the other CEDT entries, you
could bring each individual device online in its own NUMA node.

Or you can do what you're doing now, and program the endpoints to map to
the two cross-host-bridge interleave root decoders.

So your platform is giving you the option of how to online your devices,
and as such it needs to mark nodes as "possible" even if they're unused.
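
To see which nodes are "possible" but currently unused, you can compare the
possible and online node lists under /sys/devices/system/node. This is a
minimal sketch with hypothetical function names; it assumes both files use
the usual range-list format (e.g. "0-3,8").

```python
from pathlib import Path


def parse_node_list(text):
    """Parse a sysfs range list such as '0-3,8' into a set of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def unused_possible_nodes(node_dir="/sys/devices/system/node"):
    """Return the sorted node ids that are possible but not online."""
    base = Path(node_dir)
    possible = parse_node_list((base / "possible").read_text())
    online = parse_node_list((base / "online").read_text())
    return sorted(possible - online)
```

With the endpoints mapped to the two interleave root decoders, this would
plausibly list the per-device nodes (4-11 in the diagram above), since
those remain possible without ever being onlined.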

~Gregory