Re: [PATCH v5 38/41] arm_mpam: Add workaround for T241-MPAM-4

From: Ben Horgan

Date: Mon Mar 02 2026 - 12:18:43 EST


Hi Fenghua,

On 3/1/26 17:28, Fenghua Yu wrote:
> Hi, Ben,
>
> On 2/24/26 09:57, Ben Horgan wrote:
>> From: Shanker Donthineni <sdonthineni@xxxxxxxxxx>
>>
>> In the T241 implementation of memory-bandwidth partitioning, in the
>> absence
>> of contention for bandwidth, the minimum bandwidth setting can affect the
>> amount of achieved bandwidth. Specifically, the achieved bandwidth in the
>> absence of contention can settle to any value between the values of
>> MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX.  Also, if MPAMCFG_MBW_MIN is set
>> zero (below 0.78125%), once a core enters a throttled state, it will
>> never
>> leave that state.
>>
>> The first issue is not a concern if the MPAM software allows to program
>> MPAMCFG_MBW_MIN through the sysfs interface. This patch ensures program
>> MBW_MIN=1 (0.78125%) whenever MPAMCFG_MBW_MIN=0 is programmed.
>>
>> In the scenario where the resctrl doesn't support the MBW_MIN
>> interface via
>> sysfs, to achieve bandwidth closer to MBW_MAX in the absence of
>> contention,
>> software should configure a relatively narrow gap between MBW_MIN and
>> MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
>>
>> Clear the feature MBW_MIN feature from the class to ensure we don't
>> accidentally change behaviour when resctrl adds support for a MBW_MIN
>> interface.
>>
>> Tested-by: Gavin Shan <gshan@xxxxxxxxxx>
>> Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx>
>> Reviewed-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx>
>> Signed-off-by: Shanker Donthineni <sdonthineni@xxxxxxxxxx>
>> Signed-off-by: James Morse <james.morse@xxxxxxx>
>> Signed-off-by: Ben Horgan <ben.horgan@xxxxxxx>
>
> Reviewed-by: Fenghua Yu <fenghuay@xxxxxxxxxx>
>
> This patch itself is good.
>
> Please check the following comments.
>
>> ---
>> [ morse: Added as second quirk, adapted to use the new intermediate
>> values
>> in mpam_extend_config() ]
>>
>> Changes since rfc:
>> MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
>> Handling when reset_mbw_min is set
>>
>> Changes since v3:
>> Move the 5% gap policy back here
>> Clear mbw_min feature in class
>> ---
>>   Documentation/arch/arm64/silicon-errata.rst |  2 +
>>   drivers/resctrl/mpam_devices.c              | 50 +++++++++++++++++++--
>>   drivers/resctrl/mpam_internal.h             |  1 +
>>   3 files changed, 50 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/arch/arm64/silicon-errata.rst b/
>> Documentation/arch/arm64/silicon-errata.rst
>> index a65620f98e3a..a4b246655e37 100644
>> --- a/Documentation/arch/arm64/silicon-errata.rst
>> +++ b/Documentation/arch/arm64/silicon-errata.rst
>> @@ -249,6 +249,8 @@ stable kernels.
>>   +----------------+-----------------+-----------------
>> +-----------------------------+
>>   | NVIDIA         | T241 MPAM       | T241-MPAM-1     | N/
>> A                         |
>>   +----------------+-----------------+-----------------
>> +-----------------------------+
>> +| NVIDIA         | T241 MPAM       | T241-MPAM-4     | N/
>> A                         |
>> ++----------------+-----------------+-----------------
>> +-----------------------------+
>>   +----------------+-----------------+-----------------
>> +-----------------------------+
>>   | Freescale/NXP  | LS2080A/LS1043A | A-008585        |
>> FSL_ERRATUM_A008585         |
>>   +----------------+-----------------+-----------------
>> +-----------------------------+
>> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/
>> mpam_devices.c
>> index 08cb080592d9..8f44e9dee207 100644
>> --- a/drivers/resctrl/mpam_devices.c
>> +++ b/drivers/resctrl/mpam_devices.c
>> @@ -679,6 +679,12 @@ static const struct mpam_quirk mpam_quirks[] = {
>>       .iidr_mask  = MPAM_IIDR_MATCH_ONE,
>>       .workaround = T241_SCRUB_SHADOW_REGS,
>>       },
>> +    {
>> +    /* NVIDIA t241 erratum T241-MPAM-4 */
>> +    .iidr       = MPAM_IIDR_NVIDIA_T241,
>> +    .iidr_mask  = MPAM_IIDR_MATCH_ONE,
>> +    .workaround = T241_FORCE_MBW_MIN_TO_ONE,
>> +    },
>>       { NULL } /* Sentinel */
>>   };
>>   @@ -1464,6 +1470,31 @@ static void
>> mpam_quirk_post_config_change(struct mpam_msc_ris *ris, u16 partid,
>>           mpam_apply_t241_erratum(ris, partid);
>>   }
>>   +static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
>> +{
>> +    u16 max_hw_value, min_hw_granule, res0_bits;
>> +
>> +    res0_bits = 16 - props->bwa_wd;
>> +    max_hw_value = ((1 << props->bwa_wd) - 1) << res0_bits;
>> +    min_hw_granule = ~max_hw_value;
>> +
>> +    return min_hw_granule + 1;
>> +}
>> +
>> +static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
>> +{
>> +    u16 val = 0;
>> +
>> +    if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
>
> But the problem is mpam_feat_mbw_max feature is NOT set in cfg.
>
>> +        u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
>> +
>> +        if (cfg->mbw_max > delta)
>> +            val = cfg->mbw_max - delta;
>> +    }
>> +
>> +    return val;
>
> So 0 is always returned.
>
> The workaround will set mbw_min as 1% which is too small and will cause
> performance degradation, e.g. about 20% degradation on some benchmarks.
>
> This patch itself doesn't have any issue.
>
> The issue is the mbw_max feature bit in cfg is not set.

This is intended behaviour as the reset is done independently
from the value set in the config. The value is there so that
resctrl can display the expected values.

> This is a legacy issue, not introduced by this patch set.
> > Here is a fix patch for the issue:
> https://lore.kernel.org/lkml/20260301171829.1357886-1-
> fenghuay@xxxxxxxxxx/T/#u

I've commented on that patch. I think it's best to fix it in the context
of the erratum.

Does the below solve your performance problems?

diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 236f78ab9163..60d3d3e2193f 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1515,16 +1515,20 @@ static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
return min_hw_granule + 1;
}

-static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
+static u16 mpam_wa_t241_calc_min_from_max(struct mpam_props *props,
+ struct mpam_config *cfg)
{
u16 val = 0;
+ u16 max;
+ u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;

- if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
- u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
+ if (mpam_has_feature(mpam_feat_mbw_max, cfg))
+ max = cfg->mbw_max;
+ else
+ max = GENMASK(15, 16 - cprops->bwa_wd);

- if (cfg->mbw_max > delta)
- val = cfg->mbw_max - delta;
- }
+ if (max > delta)
+ val = max - delta;

return val;
}
@@ -1577,9 +1581,8 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, msc)) {
u16 min = mpam_wa_t241_force_mbw_min_to_one(rprops);

- val = mpam_wa_t241_calc_min_from_max(cfg);
- if (val < min)
- val = min;
+ val = mpam_wa_t241_calc_min_from_max(rprops, cfg);
+ val = max(val, min);
}

mpam_write_partsel_reg(msc, MBW_MIN, val);

>
> If the fix patch is good, could you please add it into the next version
> of this series?
>
> Thanks.
>
> -Fenghua

Thanks,

Ben