Re: [PATCH 3/3] acpi,srat: reduce memory block size if CFMWS has a smaller alignment
From: Dan Williams
Date: Tue Oct 08 2024 - 12:47:00 EST
Gregory Price wrote:
> On Tue, Oct 08, 2024 at 09:58:35AM -0500, Ira Weiny wrote:
> > Gregory Price wrote:
> > > The CXL Fixed Memory Window allows for memory aligned down to the
> > > size of 256MB. However, by default on x86, memory blocks increase
> > > in size as total System RAM capacity increases. On x86, this caps
> > > out at 2G when 64GB of System RAM is reached.
> > >
> > > When the CFMWS regions are not aligned to memory block size, this
> > > results in lost capacity on either side of the alignment.
> > >
> > > Parse all CFMWS to detect the largest common denomenator among all
> > > regions, and reduce the block size accordingly.
> > >
> > > This can only be done when MEMORY_HOTPLUG and SPARSEMEM configs are
> > > enabled, but the surrounding code may not necessarily require these
> > > configs, so build accordingly.
> > >
> > > Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
> > > ---
[..]
> > To help address David's comment as well;
> >
> > Is there a way to scan all the alignments of the windows and pass the
> > desired alignment to the arch in a new call and have the arch determine if
> > changing the order is ok?
> >
>
> At least on x86, it's only OK during init, so it would probably look like
> setting a static bit (like the global value in x86) and just refusing to
> update once it is locked.
>
> I could implement that on the x86 side as an example.
>
> FWIW: this was Dan's suggestion (quoting discord, sorry Dan!)
> ```
> I am assuming we would call it here
> drivers/acpi/numa/srat.c::acpi_parse_cfmws()
> which should be before page-allocator init
> ```
>
> It's only safe before page-allocator init (i.e. once blocks start getting
> populated and used), and this area occurs before that.
I will note though that drivers/acpi/numa/srat.c is always built-in, so
there is no need for set_memory_block_size_order() to be EXPORT_SYMBOL
for modules to play with, just an extern for NUMA init to access.