Re: [PATCH v5 38/41] arm_mpam: Add workaround for T241-MPAM-4
From: Fenghua Yu
Date: Mon Mar 09 2026 - 13:40:09 EST
Hi, Ben,
On 3/2/26 09:11, Ben Horgan wrote:
Hi Fenghua,
On 3/1/26 17:28, Fenghua Yu wrote:
Hi, Ben,
On 2/24/26 09:57, Ben Horgan wrote:
From: Shanker Donthineni <sdonthineni@xxxxxxxxxx>
In the T241 implementation of memory-bandwidth partitioning, in the
absence
of contention for bandwidth, the minimum bandwidth setting can affect the
amount of achieved bandwidth. Specifically, the achieved bandwidth in the
absence of contention can settle to any value between the values of
MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX. Also, if MPAMCFG_MBW_MIN is set
zero (below 0.78125%), once a core enters a throttled state, it will
never
leave that state.
The first issue is not a concern if the MPAM software allows to program
MPAMCFG_MBW_MIN through the sysfs interface. This patch ensures program
MBW_MIN=1 (0.78125%) whenever MPAMCFG_MBW_MIN=0 is programmed.
In the scenario where the resctrl doesn't support the MBW_MIN
interface via
sysfs, to achieve bandwidth closer to MBW_MAX in the absence of
contention,
software should configure a relatively narrow gap between MBW_MIN and
MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
Clear the feature MBW_MIN feature from the class to ensure we don't
accidentally change behaviour when resctrl adds support for a MBW_MIN
interface.
Tested-by: Gavin Shan <gshan@xxxxxxxxxx>
Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx>
Reviewed-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx>
Signed-off-by: Shanker Donthineni <sdonthineni@xxxxxxxxxx>
Signed-off-by: James Morse <james.morse@xxxxxxx>
Signed-off-by: Ben Horgan <ben.horgan@xxxxxxx>
Reviewed-by: Fenghua Yu <fenghuay@xxxxxxxxxx>
This patch itself is good.
Please check the following comments.
---
[ morse: Added as second quirk, adapted to use the new intermediate
values
in mpam_extend_config() ]
Changes since rfc:
MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
Handling when reset_mbw_min is set
Changes since v3:
Move the 5% gap policy back here
Clear mbw_min feature in class
---
Documentation/arch/arm64/silicon-errata.rst | 2 +
drivers/resctrl/mpam_devices.c | 50 +++++++++++++++++++--
drivers/resctrl/mpam_internal.h | 1 +
3 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/
Documentation/arch/arm64/silicon-errata.rst
index a65620f98e3a..a4b246655e37 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -249,6 +249,8 @@ stable kernels.
+----------------+-----------------+-----------------
+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-1 | N/
A |
+----------------+-----------------+-----------------
+-----------------------------+
+| NVIDIA | T241 MPAM | T241-MPAM-4 | N/
A |
++----------------+-----------------+-----------------
+-----------------------------+
+----------------+-----------------+-----------------
+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 |
FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------
+-----------------------------+
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/
mpam_devices.c
index 08cb080592d9..8f44e9dee207 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -679,6 +679,12 @@ static const struct mpam_quirk mpam_quirks[] = {
.iidr_mask = MPAM_IIDR_MATCH_ONE,
.workaround = T241_SCRUB_SHADOW_REGS,
},
+ {
+ /* NVIDIA t241 erratum T241-MPAM-4 */
+ .iidr = MPAM_IIDR_NVIDIA_T241,
+ .iidr_mask = MPAM_IIDR_MATCH_ONE,
+ .workaround = T241_FORCE_MBW_MIN_TO_ONE,
+ },
{ NULL } /* Sentinel */
};
@@ -1464,6 +1470,31 @@ static void
mpam_quirk_post_config_change(struct mpam_msc_ris *ris, u16 partid,
mpam_apply_t241_erratum(ris, partid);
}
+static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
+{
+ u16 max_hw_value, min_hw_granule, res0_bits;
+
+ res0_bits = 16 - props->bwa_wd;
+ max_hw_value = ((1 << props->bwa_wd) - 1) << res0_bits;
+ min_hw_granule = ~max_hw_value;
+
+ return min_hw_granule + 1;
+}
+
+static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
+{
+ u16 val = 0;
+
+ if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
But the problem is mpam_feat_mbw_max feature is NOT set in cfg.
+ u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
+
+ if (cfg->mbw_max > delta)
+ val = cfg->mbw_max - delta;
+ }
+
+ return val;
So 0 is always returned.
The workaround will set mbw_min as 1% which is too small and will cause
performance degradation, e.g. about 20% degradation on some benchmarks.
This patch itself doesn't have any issue.
The issue is the mbw_max feature bit in cfg is not set.
This is intended behaviour as the reset is done independently
from the value set in the config. The value is there so that
resctrl can display the expected values.
This is a legacy issue, not introduced by this patch set.
Here is a fix patch for the issue:https://lore.kernel.org/lkml/20260301171829.1357886-1-
fenghuay@xxxxxxxxxx/T/#u
I've commented on that patch. I think it's best to fix it in the context
of the erratum.
Does the below solve your performance problems?
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 236f78ab9163..60d3d3e2193f 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1515,16 +1515,20 @@ static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
return min_hw_granule + 1;
}
-static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
+static u16 mpam_wa_t241_calc_min_from_max(struct mpam_props *props,
+ struct mpam_config *cfg)
{
u16 val = 0;
+ u16 max;
+ u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
- if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
- u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
+ if (mpam_has_feature(mpam_feat_mbw_max, cfg))
+ max = cfg->mbw_max;
+ else
+ max = GENMASK(15, 16 - cprops->bwa_wd);
- if (cfg->mbw_max > delta)
- val = cfg->mbw_max - delta;
- }
Could you please add some comments on this piece of code? It's worth to comment on why there are different values on cfg and props.
+ if (max > delta)
+ val = max - delta;
return val;
}
@@ -1577,9 +1581,8 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, msc)) {
u16 min = mpam_wa_t241_force_mbw_min_to_one(rprops);
- val = mpam_wa_t241_calc_min_from_max(cfg);
- if (val < min)
- val = min;
+ val = mpam_wa_t241_calc_min_from_max(rprops, cfg);
+ val = max(val, min);
}
mpam_write_partsel_reg(msc, MBW_MIN, val);
Otherwise, this change looks good to me.
Thanks.
-Fenghua