Re: [RFC PATCH] resource: Fix CXL node not populated issue

From: Dan Williams
Date: Tue Dec 03 2024 - 22:55:50 EST


[ add regressions@xxxxxxxxxxxxxxx ]

Next time make the subject of the patch:

Revert "resource: fix region_intersects() vs add_memory_driver_managed()"

...to make it clear that this is a revert, not a fix.

The revert should be applied if a fix does not materialize in the next few weeks.

Raghavendra K T wrote:
> Before:
> ~]$ numastat -m
> ...
> Node 0 Node 1 Total
> --------------- --------------- ---------------
> MemTotal 128096.18 128838.48 256934.65
>
> After:
> $ numastat -m
> .....
> Node 0 Node 1 Node 2 Total
> --------------- --------------- --------------- ---------------
> MemTotal 128054.16 128880.51 129024.00 385958.67
>
> Current patch reverts the effect of first commit where the issue is seen.

Could you dig a bit further into the details, such as the memory map
for this platform and the ACPI SRAT tables? A dmesg comparison of the
good and bad cases would be useful (those can be shared via a GitHub
gist). Even better would be some debug instrumentation to identify which
call to __region_intersects() started behaving differently, resulting in
a whole node disappearing.
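
For the dmesg comparison, something along these lines may help. It is only
a minimal sketch: it assumes both logs were saved with the default
"[ seconds.micros ]" timestamp prefix, and the helper names
strip_timestamps() and dmesg_diff() are made up for illustration. It drops
the timestamps so that boot-time jitter does not drown out the real
differences, then reports only the lines unique to one log.

```python
import difflib
import re

# Assumption: default dmesg prefix such as "[   12.345678] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(text):
    """Return the log as a list of lines with the leading timestamp removed."""
    return [_TIMESTAMP.sub("", line) for line in text.splitlines()]


def dmesg_diff(good, bad):
    """Return lines that differ between two dmesg captures.

    Lines prefixed "- " appear only in the good log,
    lines prefixed "+ " appear only in the bad log.
    """
    a = strip_timestamps(good)
    b = strip_timestamps(bad)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        out.extend("- " + line for line in a[i1:i2])
        out.extend("+ " + line for line in b[j1:j2])
    return out
```

Feeding it the contents of the two saved logs and printing the returned
lines should give a short list to start from.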

To gauge the urgency of a fix, it would also help to know how prevalent
the system this was found on is in the wild.