Re: [PATCH v3 3/3] acpi,srat: give memory block size advice based on CFMWS alignment
From: Gregory Price
Date: Mon Oct 28 2024 - 16:55:14 EST
On Mon, Oct 28, 2024 at 07:24:54PM +0200, Mike Rapoport wrote:
> On Tue, Oct 22, 2024 at 05:34:50PM -0400, Gregory Price wrote:
> > Capacity is stranded when CFMWS regions are not aligned to block size.
> > On x86, block size increases with capacity (2G blocks @ 64G capacity).
> >
> > Use CFMWS base/size to report memory block size alignment advice.
> >
> > After the alignment, the acpi code begins populating numa nodes with
> > memblocks, so probe the value just prior to lock it in. All future
> > callers should be providing advice prior to this point.
> >
> > Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
> > ---
> > drivers/acpi/numa/srat.c | 33 +++++++++++++++++++++++++++++++++
> > 1 file changed, 33 insertions(+)
> >
... snip ...
> > + /* Align memblock size to CFMW regions if possible */
> > + acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_align_cfmws, NULL);
> > +
> > + /*
> > + * Nodes start populating with blocks after this, so probe the max
> > + * block size to prevent it from changing in the future.
> > + */
> > + memory_block_probe_max_size();
> > +
>
> It won't change, but how drivers/base/memory.c will know about the probed
> size if architecture does not override memory_block_size_bytes()?
>
Non-arch code should call memory_block_size_bytes() to discover the
actual size of blocks, and for archs that care about this value, that
is when it should be probed. It's up to the arch whether/how to use
this information. Many archs ignore it entirely and use
MIN_MEMORY_BLOCK_SIZE.

Basically, non-arch code shouldn't care what this value is, and even
most arch code shouldn't care.

I added the probe call here to lock in the size, since I saw that nodes
start populating with blocks immediately after this point.
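
To make concrete what kind of value gets locked in, here is a rough userspace sketch of the alignment arithmetic. It is not the code from the patch: the function name and both limits are made up for illustration (128M as an assumed minimum block size, 2G as the maximum mentioned in the changelog). It picks the largest power-of-two block size that divides both the base and the size of a CFMWS window, so no capacity is stranded at either end.

```c
#include <stdint.h>

/* Assumed limits for illustration only; real values are arch-specific. */
#define SKETCH_MIN_BLOCK_SIZE (128ULL << 20)	/* 128 MiB */
#define SKETCH_MAX_BLOCK_SIZE (2ULL << 30)	/* 2 GiB */

/*
 * Hypothetical helper: return the largest power-of-two block size in
 * [SKETCH_MIN_BLOCK_SIZE, SKETCH_MAX_BLOCK_SIZE] that evenly divides
 * both the window base and the window size, or 0 if even the minimum
 * block size does not fit the window's alignment.
 */
static uint64_t sketch_cfmws_block_advice(uint64_t base, uint64_t size)
{
	uint64_t bs;

	for (bs = SKETCH_MAX_BLOCK_SIZE; bs >= SKETCH_MIN_BLOCK_SIZE; bs >>= 1) {
		if ((base % bs) == 0 && (size % bs) == 0)
			return bs;
	}
	return 0;
}
```

For example, a window at 66G with 4G of capacity is fine with 2G blocks, while a window based at 65G + 256M can only be fully covered with blocks of 256M or smaller.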

Possibly the APIs should be marked __init so that the whole interface
disappears after init, to avoid misuse post-init.

Possibly probe() should return -EBUSY if called more than once, to
enforce a particular probe pattern on the architectures?

Open to thoughts here.
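
As a strawman for the -EBUSY idea, a userspace sketch of the advise-then-probe-once pattern could look like the following. All names here (sketch_advise_max_size, sketch_probe_max_size, sketch_block_size, SKETCH_DEFAULT_BLOCK_SIZE) are hypothetical and the signatures are not those of the proposed kernel interface. The smallest advice wins, the first probe locks the value in, and any later advise or probe call is rejected.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed fallback when nobody gave advice; illustration only. */
#define SKETCH_DEFAULT_BLOCK_SIZE (128ULL << 20)

static uint64_t sketch_advised;	/* smallest advice so far, 0 = none */
static uint64_t sketch_probed;	/* locked-in size, 0 = not probed yet */

/* Record advice; only allowed before the size has been probed. */
static int sketch_advise_max_size(uint64_t size)
{
	if (sketch_probed)
		return -EBUSY;
	if (!size)
		return -EINVAL;
	if (!sketch_advised || size < sketch_advised)
		sketch_advised = size;
	return 0;
}

/* Lock in the size; a second call fails instead of re-probing. */
static int sketch_probe_max_size(uint64_t *out)
{
	if (sketch_probed)
		return -EBUSY;
	sketch_probed = sketch_advised ? sketch_advised
				       : SKETCH_DEFAULT_BLOCK_SIZE;
	*out = sketch_probed;
	return 0;
}

/* Read-only accessor for callers that only need the final value. */
static uint64_t sketch_block_size(void)
{
	return sketch_probed;
}
```

One consequence of this shape is that everything after the first probe has to go through a read-only accessor, which is roughly the role memory_block_size_bytes() plays for non-arch code.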
> > /* fake_pxm is the next unused PXM value after SRAT parsing */
> > for (i = 0, fake_pxm = -1; i < MAX_NUMNODES; i++) {
> > if (node_to_pxm_map[i] > fake_pxm)
> > --
> > 2.43.0
> >
>
> --
> Sincerely yours,
> Mike.