Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be per-arch
From: Chen, Yu C
Date: Mon Sep 29 2025 - 05:19:52 EST
On 9/26/2025 6:58 AM, Luck, Tony wrote:
On Mon, Sep 22, 2025 at 04:04:40PM +0100, Dave Martin wrote:
Hi again,
On Fri, Sep 12, 2025 at 03:19:04PM -0700, Reinette Chatre wrote:
[...]
[snip]
For example, userspace might write the following:
MB_MIN: 0=16, 1=16
MB_MAX: 0=32, 1=32
Which might then read back as follows:
MB: 0=50, 1=50
# MB_HW: 0=32, 1=32
# MB_MIN: 0=16, 1=16
# MB_MAX: 0=32, 1=32
I haven't tried to develop this idea further, for now.
I'd be interested in people's thoughts on it, though.
Applying this to Intel's upcoming region-aware memory bandwidth
that supports 255 steps and h/w min/max limits.
We would have info files with "min = 1, max = 255" and a schemata
file that looks like this to legacy apps:
MB: 0=50;1=75
#MB_HW: 0=128;1=191
#MB_MIN: 0=128;1=191
#MB_MAX: 0=128;1=191
But a newer app that is aware of the extensions can write:
# cat > schemata << 'EOF'
MB_HW: 0=10
MB_MIN: 0=10
MB_MAX: 0=64
EOF
which then reads back as:
MB: 0=4;1=75
#MB_HW: 0=10;1=191
#MB_MIN: 0=10;1=191
#MB_MAX: 0=64;1=191
with the legacy line updated with the rounded value of the MB_HW
supplied by the user. 10/255 = 3.921% ... so call it "4".
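The rounding above can be sketched as follows. This is only an illustration of the arithmetic, not the resctrl implementation: the helper name `hw_to_percent`, the default `hw_max` of 255, and the round-to-nearest policy are assumptions taken from the example values in this mail.

```python
def hw_to_percent(hw_value: int, hw_max: int = 255) -> int:
    """Scale a hardware step count to a whole-number percentage.

    Hypothetical helper: hw_max=255 and round-to-nearest are assumptions
    based on the "10/255 = 3.921% ... so call it 4" example.
    """
    if not 1 <= hw_value <= hw_max:
        raise ValueError(f"hw_value must be in 1..{hw_max}")
    # Integer round-to-nearest, avoiding floating point.
    return (hw_value * 100 + hw_max // 2) // hw_max


if __name__ == "__main__":
    for hw in (10, 128, 191):
        print(f"MB_HW={hw} -> MB={hw_to_percent(hw)}")
```

With these assumptions, 10, 128 and 191 map to 4, 50 and 75, matching the MB values shown in the schemata examples.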
This seems workable, as it introduces the new interface
while preserving backward compatibility.
One minor question: according to "Figure 6-5. MBA Optimal
Bandwidth Register" in the latest RDT specification [1], the
bandwidth value ranges from 1 to 511.
Additionally, this bandwidth field is located at bits 48 to 60 in
the MBA Optimal Bandwidth Register, so the range for
this 13-bit segment could be 1 to 8191. I just wonder whether the
current maximum value of 511 might be extended in the future.
Perhaps we could explore a method to query the upper limit from
an ACPI table or register, or use CPUID to distinguish between
platforms, rather than hardcoding it. Reinette also mentioned this
in another thread.
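To make the field-width arithmetic concrete, here is a minimal sketch. The helper names are hypothetical, and nothing here reads a real register; it only shows how the largest encodable value follows from the width of a bit field, which is why a limit derived from the register layout (or enumerated at runtime) may differ from a hardcoded one.

```python
def field_max(width: int) -> int:
    """Largest unsigned value that fits in a bit field of the given width."""
    if width <= 0:
        raise ValueError("width must be positive")
    return (1 << width) - 1


def extract_field(reg: int, low_bit: int, width: int) -> int:
    """Extract a width-bit field starting at low_bit from a register value."""
    return (reg >> low_bit) & field_max(width)


if __name__ == "__main__":
    # A 9-bit field tops out at 511; a 13-bit field at 8191.
    print(field_max(9), field_max(13))
    # Hypothetical 64-bit register value with 300 stored at bit 48.
    reg = 300 << 48
    print(extract_field(reg, 48, 13))
```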
Thanks,
Chenyu
[1] https://www.intel.com/content/www/us/en/content-details/851356/intel-resource-director-technology-intel-rdt-architecture-specification.html
The region-aware h/w supports separate bandwidth controls for each
region. We could hope (or perhaps update the spec to define) that
region0 is always node-local DDR memory and keep the "MB" tag for
that.
Then use some other tag naming for other regions. Remote DDR,
local CXL, remote CXL are the ones we think are next in the h/w
memory sequence. But the "region" concept would allow for other
options as other memory technologies come into use.
Cheers
---Dave