Re: [PATCH v3] x86/resctrl: mba_MBps: Fall back to total b/w if local b/w unavailable
From: Reinette Chatre
Date: Wed Nov 08 2023 - 16:49:55 EST
Hi Tony,
On 11/7/2023 1:15 PM, Tony Luck wrote:
> On Fri, Nov 03, 2023 at 02:43:15PM -0700, Reinette Chatre wrote:
>> On 10/26/2023 1:02 PM, Tony Luck wrote:
>>> If local bandwidth measurement is not available, do not give up on
>>> providing the "mba_MBps" feedback option completely. Instead, make the
>>> code fall back to using total bandwidth.
>>
>> It is interesting to me that the "fall back" is essentially a drop-in
>> replacement without any adjustments to the data/algorithm.
>
> The algorithm is, by necessity, very simple. Essentially "if measured
> bandwidth is above desired target, apply one step extra throttling.
> Reverse when bandwidth is below desired level." I'm not sure what tweaks
> are possible.
>
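For illustration, the loop described above could be modeled in user space roughly as follows. This is only a sketch: the struct, its field names, and mbps_feedback_step() are hypothetical, and it leaves out any hysteresis or delta handling the real update_mba_bw() may apply.

```c
#include <stdint.h>

/* Hypothetical user-space model; field names are not the kernel's. */
struct mbps_ctrl {
	uint32_t target_mbps;  /* bandwidth limit requested by the user */
	uint32_t allowed_pct;  /* current setting: percent of bandwidth allowed */
	uint32_t step_pct;     /* size of one throttling step */
	uint32_t min_pct;      /* most restrictive setting */
	uint32_t max_pct;      /* least restrictive setting */
};

/* Apply one feedback step based on the latest bandwidth sample. */
static void mbps_feedback_step(struct mbps_ctrl *c, uint32_t measured_mbps)
{
	if (measured_mbps > c->target_mbps) {
		/* Above target: throttle one step harder, clamped at min_pct. */
		if (c->allowed_pct >= c->min_pct + c->step_pct)
			c->allowed_pct -= c->step_pct;
		else
			c->allowed_pct = c->min_pct;
	} else if (measured_mbps < c->target_mbps) {
		/* Below target: relax one step, clamped at max_pct. */
		if (c->allowed_pct + c->step_pct <= c->max_pct)
			c->allowed_pct += c->step_pct;
		else
			c->allowed_pct = c->max_pct;
	}
}
```

Nothing in this step depends on whether the sample is the local or the total count, which is why the event can be swapped without touching the loop. Whether the two are equivalent as inputs is a separate question.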
>> Can these measurements be considered equivalent? Could a user now perhaps
>> want to experiment by disabling local bandwidth measurement to explore
>> whether the system behaves differently when using total memory bandwidth?
>> What would have a user choose one over the other (apart from when the
>> user is forced by system ability)?
>
> This may be interesting. I dug around in the e-mail archives to see if
> there was any discussion on why "local" was picked as the feedback
> measurement rather than "total". But I couldn't find anything.
>
> Thinking about it now, "total" feels like a better choice. Why would
> you not care about off-package memory bandwidth? In pathological cases
> all the memory traffic might be going off package, but the existing
> mba_MBps algorithm would *reduce* the amount of throttling, eventually
> to zero.
>
> Maybe an additional mount option "mba_MBps_total" so the user can pick
> total instead of local?
Is this something for which a remount is required? Can it not perhaps be
changed at runtime?
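
To make the question concrete, a runtime-selectable choice might look something like the sketch below. Everything here is hypothetical (the enum, the struct, the prefer_total knob, and mbps_select_event()); no such interface exists in resctrl today, and this is a user-space model rather than proposed kernel code.

```c
#include <stdbool.h>

/* Hypothetical event identifiers for this model only. */
enum mbps_event {
	MBPS_EVENT_NONE,
	MBPS_EVENT_LOCAL,
	MBPS_EVENT_TOTAL,
};

/* Hypothetical state: what the platform offers and what the user asked for. */
struct mbps_config {
	bool local_available;  /* platform provides the local bandwidth event */
	bool total_available;  /* platform provides the total bandwidth event */
	bool prefer_total;     /* assumed runtime knob, writable without remount */
};

/* Pick the feedback event each time it is needed, so a changed knob takes effect. */
static enum mbps_event mbps_select_event(const struct mbps_config *cfg)
{
	if (cfg->prefer_total && cfg->total_available)
		return MBPS_EVENT_TOTAL;
	if (cfg->local_available)
		return MBPS_EVENT_LOCAL;
	if (cfg->total_available)
		return MBPS_EVENT_TOTAL;
	return MBPS_EVENT_NONE;
}
```

Because the event is re-evaluated on every call, flipping prefer_total would take effect on the next feedback interval without a remount.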
>
>>>
>>> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
>>> ---
>>> Change since v2:
>>>
>>> Babu doesn't like the global variable. So here's a version without it.
>>>
>>> Note that my preference is still the v2 version. But as I tell newbies
>>> to Linux "Your job isn't to get YOUR patch upstream. Your job is to get
>>> the problem fixed.". So taking my own advice I don't really mind
>>> whether v2 or v3 is applied.
>>>
>>> arch/x86/kernel/cpu/resctrl/monitor.c | 43 ++++++++++++++++++--------
>>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
>>> 2 files changed, 31 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> index f136ac046851..29e86310677d 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> @@ -418,6 +418,20 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
>>>  	return 0;
>>>  }
>>>
>>> +/*
>>> + * For legacy compatibility use the local memory bandwidth to drive
>>> + * the mba_MBps feedback control loop. But on platforms that do not
>>> + * provide the local event fall back to use the total bandwidth event
>>> + * instead.
>>> + */
>>> +static enum resctrl_event_id pick_mba_mbps_event(void)
>>> +{
>>> +	if (is_mbm_local_enabled())
>>> +		return QOS_L3_MBM_LOCAL_EVENT_ID;
>>> +
>>> +	return QOS_L3_MBM_TOTAL_EVENT_ID;
>>> +}
>>
>> Can there be a WARN here to catch the unlikely event that
>> !is_mbm_total_enabled()?
>> This may mean the caller (in update_mba_bw()) needs to move
>> to code protected by is_mbm_enabled().
>
> All this code is under the protection of the check at mount time
> done by supports_mba_mbps()
>
> static bool supports_mba_mbps(void)
> {
> 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
>
> 	return (is_mbm_enabled() &&
> 		r->alloc_capable && is_mba_linear());
> }
>
> Adding even more run-time checks seems overkill.
Refactoring the code into a function, but then implicitly assuming and
requiring that the function only be called in specific flows on systems
with a particular environment, does not sound appealing to me.
Another alternative, since only one caller of this function remains,
is to remove this function and instead open code it within update_mba_bw(),
replacing the is_mbm_enabled() call.
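
As a rough illustration of the open-coded alternative, here is a user-space model. The enum and update_bw_model() are hypothetical, the two boolean parameters stand in for is_mbm_local_enabled() and is_mbm_total_enabled(), and the error return stands in for whatever warning the kernel code would use.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical event identifiers for this model only. */
enum model_event {
	MODEL_EVENT_LOCAL,
	MODEL_EVENT_TOTAL,
};

/*
 * Model of the caller with the event choice open coded. The selection
 * replaces a plain "is any MBM event enabled" test, so the case where
 * neither event exists is handled right where the choice is made.
 * Returns 0 and stores the chosen event, or -1 if no event is enabled.
 */
static int update_bw_model(bool local_enabled, bool total_enabled,
			   enum model_event *evt)
{
	if (local_enabled) {
		*evt = MODEL_EVENT_LOCAL;
	} else if (total_enabled) {
		*evt = MODEL_EVENT_TOTAL;
	} else {
		fprintf(stderr, "no bandwidth event enabled\n");
		return -1;
	}

	/* ... read the counter for *evt and run the feedback step ... */
	return 0;
}
```

With the selection inline, the function no longer relies on an earlier mount-time check to guarantee that one of the two events exists.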
Reinette