Re: [PATCH v11 20/23] x86/resctrl: Configure mbm_cntr_assign mode if supported
From: James Morse
Date: Fri Feb 21 2025 - 13:09:21 EST
Hi Babu,
On 22/01/2025 20:20, Babu Moger wrote:
> Configure mbm_cntr_assign mode on AMD platforms. On AMD platforms, it
> is recommended to use mbm_cntr_assign mode if supported, because
> reading "mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable'
> if there is no counter associated with that event.
(If you agree with my comment on patch 7, it would be good to update this
wording to match.)
> The mbm_cntr_assign mode, referred to as ABMC (Assignable Bandwidth
> Monitoring Counters) on AMD, is enabled by default when supported by the
> system.
>
> Update ABMC across all logical processors within the resctrl domain to
> ensure proper functionality.
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index c006c4d8d6ff..2480698b643d 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -734,4 +734,5 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
> void mbm_cntr_reset(struct rdt_resource *r);
> int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
> +void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
Could this be put in include/linux/resctrl.h? It's where it needs to end up eventually.
This sequence has me confused:
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 3d748fdbcb5f..a9a5dc626a1e 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1233,6 +1233,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
> r->mon.mbm_cntr_assignable = true;
> cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
> r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
> + hw_res->mbm_cntr_assign_enabled = true;
Here the arch code sets ABMC to be enabled by default at boot.
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 6922173c4f8f..515969c5f64f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4302,9 +4302,13 @@ int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
>
> void resctrl_online_cpu(unsigned int cpu)
> {
> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> mutex_lock(&rdtgroup_mutex);
> /* The CPU is set in default rdtgroup after online. */
> cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
> + if (r->mon_capable && r->mon.mbm_cntr_assignable)
> + resctrl_arch_mbm_cntr_assign_set_one(r);
> mutex_unlock(&rdtgroup_mutex);
> }
But here, resctrl has to call back to the arch code to make sure the hardware is in the
same state as hw_res->mbm_cntr_assign_enabled.
Could this be done in resctrl_arch_online_cpu() instead? That way resctrl doesn't get CPUs
in an inconsistent state that it has to fix up...
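To illustrate the ordering I mean, here is a rough user-space model, not kernel code. Every model_* name is made up for illustration, the arrays stand in for per-CPU hardware state, and the real resctrl_arch_online_cpu() does more than this. The point is only that the arch hook syncs the CPU's ABMC state from the cached flag before it hands the CPU to the filesystem side:

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Stand-in for the arch-private resource state (hw_res in the patch). */
struct model_hw_res {
	bool mbm_cntr_assignable;	/* hardware supports ABMC */
	bool mbm_cntr_assign_enabled;	/* cached "should be enabled" flag */
};

/* Stand-ins for per-CPU hardware state and the default group's CPU mask. */
static bool model_cpu_abmc[MODEL_NR_CPUS];
static bool model_cpu_in_default_group[MODEL_NR_CPUS];

/* Models resctrl_online_cpu(): filesystem-side bookkeeping only. */
static void model_fs_online_cpu(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_cpu_in_default_group[cpu] = true;
}

/*
 * Models resctrl_arch_online_cpu(): bring the CPU's hardware state in
 * line with the cached flag first, then tell the filesystem code.
 */
static void model_arch_online_cpu(const struct model_hw_res *hw,
				  unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (hw->mbm_cntr_assignable)
		model_cpu_abmc[cpu] = hw->mbm_cntr_assign_enabled;
	model_fs_online_cpu(cpu);
}
```

With that shape, the filesystem-side hook never sees a CPU whose ABMC state disagrees with hw_res->mbm_cntr_assign_enabled, so it has nothing to fix up.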
Thanks,
James