Re: [PATCH] acpi/hmat,mm/memtier: always register hmat adist calculation callback

From: Huang, Ying
Date: Sun Jul 28 2024 - 21:06:19 EST


Gregory Price <gourry@xxxxxxxxxx> writes:

> In the event that hmat data is not available for the DRAM tier,
> or if it is invalid (bandwidth or latency is 0), we can still register
> a callback to calculate the abstract distance for non-cpu nodes
> and simply assign it a different tier manually.
>
> In the case where DRAM HMAT values are missing or not sane we
> manually assign adist=(MEMTIER_ADISTANCE_DRAM + MEMTIER_CHUNK_SIZE).
>
> If the HMAT data for the non-cpu tier is invalid (e.g. bw = 0), we
> cannot reasonably determine where to place the tier, so it will default
> to MEMTIER_ADISTANCE_DRAM (which is the existing behavior).
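
The three cases in the description above can be sketched as a small standalone C function. This is only an illustration of the decision logic, not the kernel code: the `sketch_` names, the `struct sketch_perf` layout, and the numeric values of the constants are assumptions made for this example (the real definitions are in include/linux/memory-tiers.h), and the proportional calculation used when DRAM HMAT data is valid is deliberately left out.

```c
#include <errno.h>

/* Assumed illustrative values, not taken from the patch. */
#define SKETCH_CHUNK_SIZE	(1 << 10)
#define SKETCH_ADISTANCE_DRAM	(4 * SKETCH_CHUNK_SIZE + (SKETCH_CHUNK_SIZE >> 1))
#define SKETCH_NO_NODE		(-1)

/* Hypothetical stand-in for the kernel's struct access_coordinate. */
struct sketch_perf {
	unsigned int read_bandwidth;
	unsigned int write_bandwidth;
	unsigned int read_latency;
	unsigned int write_latency;
};

/*
 * Returns:
 *   -EINVAL  node's own HMAT data is unusable; the caller keeps the
 *            default of SKETCH_ADISTANCE_DRAM (existing behavior).
 *   0        DRAM reference is missing; *adist is set one chunk
 *            (one tier) below DRAM, as the patch proposes.
 *   1        both are usable; the caller would go on to the
 *            proportional calculation, which this sketch omits.
 */
static int sketch_classify(const struct sketch_perf *perf,
			   int dram_ref_nid, int *adist)
{
	if (perf->read_latency + perf->write_latency == 0 ||
	    perf->read_bandwidth + perf->write_bandwidth == 0)
		return -EINVAL;

	if (dram_ref_nid == SKETCH_NO_NODE) {
		*adist = SKETCH_ADISTANCE_DRAM + SKETCH_CHUNK_SIZE;
		return 0;
	}

	return 1;
}
```

Note the ordering: the node's own data is validated first, so a node with zero bandwidth never reaches the new fallback even when the DRAM reference is missing.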

Why do we need this? Do you have machines with a broken HMAT table? Can
you ask the vendor to fix the HMAT table?

--
Best Regards,
Huang, Ying

> Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
> ---
> drivers/acpi/numa/hmat.c | 6 ++++--
> mm/memory-tiers.c | 10 ++++++++--
> 2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index 2c8ccc91ebe6..1642d2bd83b5 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -1080,8 +1080,10 @@ static __init int hmat_init(void)
>  	if (hotplug_memory_notifier(hmat_callback, HMAT_CALLBACK_PRI))
>  		goto out_put;
>
> -	if (!hmat_set_default_dram_perf())
> -		register_mt_adistance_algorithm(&hmat_adist_nb);
> +	if (hmat_set_default_dram_perf())
> +		pr_notice("Failed to set default dram perf\n");
> +
> +	register_mt_adistance_algorithm(&hmat_adist_nb);
>
>  	return 0;
>  out_put:
> diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> index 6632102bd5c9..43bd508938ae 100644
> --- a/mm/memory-tiers.c
> +++ b/mm/memory-tiers.c
> @@ -765,8 +765,14 @@ int mt_perf_to_adistance(struct access_coordinate *perf, int *adist)
>  	    perf->read_bandwidth + perf->write_bandwidth == 0)
>  		return -EINVAL;
>
> -	if (default_dram_perf_ref_nid == NUMA_NO_NODE)
> -		return -ENOENT;
> +	/*
> +	 * If the DRAM tier did not have valid HMAT data, we can instead just
> +	 * assume that the non-cpu numa nodes are 1 tier below cpu nodes
> +	 */
> +	if (default_dram_perf_ref_nid == NUMA_NO_NODE) {
> +		*adist = MEMTIER_ADISTANCE_DRAM + MEMTIER_CHUNK_SIZE;
> +		return 0;
> +	}
>
>  	/*
>  	 * The abstract distance of a memory node is in direct proportion to