Re: [PATCH] x86/resctrl: Fix memory bandwidth counter width for AMD

From: Reinette Chatre
Date: Tue Jun 02 2020 - 17:51:28 EST


Hi Babu,

On 6/1/2020 4:00 PM, Babu Moger wrote:
> Memory bandwidth is calculated reading the monitoring counter
> at two intervals and calculating the delta. It is the software's
> responsibility to read the count often enough to avoid having
> the count roll over _twice_ between reads.
>
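
As an aside, the read-and-delta scheme described above can be sketched
in plain C. This is only a minimal userspace illustration, not the
resctrl code itself: the helper name mbm_delta() and its signature are
made up for this example. Shifting both reads up to bit 63 lets the
unsigned subtraction absorb exactly one rollover of a `width`-bit
counter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: difference between two raw reads of a counter
 * that is `width` bits wide (1..64). Both values are shifted so that
 * the counter's top bit lands on bit 63; the 64-bit unsigned
 * subtraction then wraps the same way the hardware counter does.
 */
static uint64_t mbm_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift;

	assert(width >= 1 && width <= 64);
	shift = 64 - width;

	return ((cur << shift) - (prev << shift)) >> shift;
}
```

With a 24-bit counter, a previous read of 0xfffff0 followed by a read of
0x10 yields a delta of 0x20. A second rollover between the two reads
would go unnoticed, which is why the read interval matters.
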
> The current code hardcodes the bandwidth monitoring counter's width
> to 24 bits for AMD. This is due to default base counter width which
> is 24. Currently, AMD does not implement the CPUID 0xF.[ECX=1]:EAX
> to adjust the counter width. But, the AMD hardware supports much
> wider bandwidth counter with the default width of 44 bits.
>
> Kernel reads these monitoring counters every 1 second and adjusts the
> counter value for overflow. With 24 bits and scale value of 64 for AMD,
> it can only measure up to 1GB/s without overflowing. For the rates
> above 1GB/s this will fail to measure the bandwidth.
>
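
To make the 1GB/s figure concrete: a 24-bit counter wraps after 2^24
units, and at 64 bytes per unit that is 2^30 bytes, so a once-per-second
read cannot distinguish rates at or above roughly 1GB/s. The helper
below is a hypothetical illustration of that arithmetic, not kernel
code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes a `width`-bit counter can
 * represent before it wraps once, given `scale` bytes per counter unit.
 * The assertion rejects inputs whose product would overflow 64 bits.
 */
static uint64_t mbm_bytes_before_wrap(unsigned int width, uint64_t scale)
{
	assert(width < 64 && scale <= (UINT64_MAX >> width));

	return (UINT64_C(1) << width) * scale;
}
```

For width 24 and scale 64 this gives 2^30 bytes; for width 44 and the
same scale it gives 2^50 bytes per read interval.
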
> Fix the issue setting the default width to 44 bits by adjusting the
> offset.
>
> AMD future products will implement the CPUID 0xF.[ECX=1]:EAX.
>
> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> ---
> - Sending it second time. Email client had some issues first time.
> - Generated the patch on top of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git (x86/cache).
>
> arch/x86/kernel/cpu/resctrl/core.c | 8 +++++++-
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 12f967c6b603..6040e9ae541b 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -983,7 +983,13 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
> 		c->x86_cache_occ_scale = ebx;
> 		if (c->x86_vendor == X86_VENDOR_INTEL)
> 			c->x86_cache_mbm_width_offset = eax & 0xff;
> -		else
> +		else if (c->x86_vendor == X86_VENDOR_AMD) {
> +			if (eax)

This test checks if _any_ bit is set in eax ...

> +				c->x86_cache_mbm_width_offset = eax & 0xff;

... with the assumption that the first eight bits contain a value.

Even so, now that Intel and AMD will be using eax in the same way,
perhaps this can be simplified by always using eax to obtain the offset
(which avoids the code duplication) and, on AMD, falling back to the
default when the offset cannot be obtained from eax?

What I mean is something like:

@@ -1024,10 +1024,12 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)

 		c->x86_cache_max_rmid = ecx;
 		c->x86_cache_occ_scale = ebx;
-		if (c->x86_vendor == X86_VENDOR_INTEL)
-			c->x86_cache_mbm_width_offset = eax & 0xff;
-		else
-			c->x86_cache_mbm_width_offset = -1;
+		c->x86_cache_mbm_width_offset = eax & 0xff;
+		if (c->x86_vendor == X86_VENDOR_AMD &&
+		    c->x86_cache_mbm_width_offset == 0) {
+			c->x86_cache_mbm_width_offset =
+				MBM_CNTR_WIDTH_OFFSET_AMD;
+		}
 	}
 }
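
The same logic can be tried outside the kernel with a small standalone
sketch. The assumptions here are mine: the enum and function names are
invented for illustration, and the two constants are inferred from the
commit message (base width of 24, AMD default width of 44, hence an
offset of 20) rather than copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values inferred from the commit message; stand-ins for this sketch. */
#define MBM_CNTR_WIDTH_BASE		24
#define MBM_CNTR_WIDTH_OFFSET_AMD	20

static_assert(MBM_CNTR_WIDTH_BASE + MBM_CNTR_WIDTH_OFFSET_AMD == 44,
	      "AMD default counter width should be 44 bits");

/* Hypothetical stand-in for the c->x86_vendor comparison. */
enum vendor { VENDOR_INTEL, VENDOR_AMD };

/*
 * Always take the offset from the low byte of eax; on AMD, fall back
 * to the default when that byte is zero.
 */
static unsigned int mbm_width_offset(enum vendor v, uint32_t eax)
{
	unsigned int offset = eax & 0xff;

	if (v == VENDOR_AMD && offset == 0)
		offset = MBM_CNTR_WIDTH_OFFSET_AMD;

	return offset;
}

/* Effective counter width: base width plus the resolved offset. */
static unsigned int mbm_width(enum vendor v, uint32_t eax)
{
	return MBM_CNTR_WIDTH_BASE + mbm_width_offset(v, eax);
}
```

Because the fallback tests the masked byte rather than all of eax, a
value such as 0x100 (upper bits set, low byte zero) still selects the
AMD default instead of storing an offset of 0.
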

What do you think?

Reinette