Re: [PATCH 3/3] acpi,srat: reduce memory block size if CFMWS has a smaller alignment

From: David Hildenbrand
Date: Mon Oct 14 2024 - 07:53:28 EST


On 08.10.24 18:46, Dan Williams wrote:
Gregory Price wrote:
On Tue, Oct 08, 2024 at 09:58:35AM -0500, Ira Weiny wrote:
Gregory Price wrote:
The CXL Fixed Memory Window allows for memory aligned down to the
size of 256MB. However, by default on x86, memory blocks increase
in size as total System RAM capacity increases. On x86, this caps
out at 2G when 64GB of System RAM is reached.

When the CFMWS regions are not aligned to memory block size, this
results in lost capacity on either side of the alignment.

Parse all CFMWS to detect the largest common denomenator among all
regions, and reduce the block size accordingly.

This can only be done when MEMORY_HOTPLUG and SPARSEMEM configs are
enabled, but the surrounding code may not necessarily require these
configs, so build accordingly.

Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
---
[..]
To help address David's comment as well;

Is there a way to scan all the alignments of the windows and pass the
desired alignment to the arch in a new call and have the arch determine if
changing the order is ok?


At least on x86, it's only OK during init, so it would probably look like
setting a static bit (like the global value in x86) and just refusing to
update once it is locked.

I could implement that on the x86 side as an example.

FWIW: this was Dan's suggestion (quoting discord, sorry Dan!)
```
I am assuming we would call it here
drivers/acpi/numa/srat.c::acpi_parse_cfmws()
which should be before page-allocator init
```

It's only safe before page-allocator init (i.e. once blocks start getting
populated and used), and this area occurs before that.

Sorry for the late reply. It must also be called before memory_dev_init(), which happens after the buddy is up IIRC.


I will note though that drivers/acpi/numa/srat.c is always built-in, so
there is no need for set_memory_block_size_order() to be EXPORT_SYMBOL
for modules to play with, just an extern for NUMA init to access.

That's the magic piece I was missing.

Because it didn't make too much sense for me a call to this function would ever makes sense after modules where loaded.

This really only works that early during boot, that modules are not loaded yet.

So this patch here should be dropped.

--
Cheers,

David / dhildenb