Re: [PATCH] x86/resctrl: Implement new MBA_mbps throttling heuristic

From: Xiaochen Shen
Date: Tue Jan 16 2024 - 22:37:16 EST


Hi Reinette and Tony,

On 1/17/2024 3:55, Reinette Chatre wrote:
Hi Xiaochen,

On 1/9/2024 2:00 PM, Tony Luck wrote:
The MBA_mbps feedback loop increases throttling when a group is using
more bandwidth than the target set by the user in the schemata file, and
decreases throttling when below target.

To avoid possibly stepping throttling up and down on every poll a
flag "delta_comp" is set whenever throttling is changed to indicate
that the actual change in bandwidth should be recorded on the next
poll in "delta_bw". Throttling is only reduced if the current bandwidth
plus delta_bw is below the user target.

This algorithm works well if the workload has steady bandwidth needs.
But it can go badly wrong if the workload moves to a different phase
just as the throttling level changed. E.g. if the workload becomes
essentially idle right as throttling level is increased, the value
calculated for delta_bw will be more or less the old bandwidth level.
If the workload then resumes, Linux may never reduce throttling because
current bandwidth plus delta_bw is above the target set by the user.
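
To make the failure mode concrete, here is a minimal sketch of the delta_bw check as described above. It is not the kernel code: the function names and the bandwidth figures are hypothetical, chosen only to illustrate how a stale delta_bw can block any later reduction in throttling.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: delta_bw is the change in measured bandwidth
 * observed on the poll right after the throttling level was changed.
 */
static uint64_t record_delta_bw(uint64_t prev_bw, uint64_t cur_bw)
{
	return prev_bw > cur_bw ? prev_bw - cur_bw : cur_bw - prev_bw;
}

/*
 * Hypothetical helper: the old rule only reduces throttling when the
 * current bandwidth plus delta_bw is below the user target.
 */
static bool old_may_reduce_throttling(uint64_t cur_bw, uint64_t delta_bw,
				      uint64_t user_bw)
{
	return cur_bw + delta_bw < user_bw;
}
```

With an illustrative target of 1000 MBps: the group runs at 1100, so throttling is increased. If the workload goes nearly idle on the next poll (say 50), delta_bw is recorded as 1050. When the workload later resumes at 600, the check sees 600 + 1050 = 1650, which is above the target, so throttling is never relaxed.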

Implement a simpler heuristic by assuming that in the worst case the
currently measured bandwidth is being controlled by the current level of
throttling. Compute how much it may increase if throttling is relaxed to
the next higher level. If that is still below the user target, then it
is ok to reduce the amount of throttling.
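
A rough sketch of that heuristic follows. It is not taken from the patch itself. It assumes the throttle level is expressed as a percentage of bandwidth allowed (a higher value means less throttling), that levels move in fixed steps, and that bandwidth scales linearly with the level in the worst case. The names may_reduce_throttling, cur_pct and step_pct are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: cur_pct is the current throttle level as a percentage of
 * bandwidth allowed, and step_pct is the granularity between levels.
 * Returns true if relaxing by one step is projected to stay below the
 * user target.
 */
static bool may_reduce_throttling(uint64_t cur_bw, uint32_t cur_pct,
				  uint32_t step_pct, uint64_t user_bw)
{
	uint32_t next_pct = cur_pct + step_pct;
	uint64_t projected_bw;

	/* Already at the least-throttled level, or nothing to scale from. */
	if (cur_pct == 0 || next_pct > 100)
		return false;

	/* Worst case: measured bandwidth is limited only by the throttle. */
	projected_bw = cur_bw * next_pct / cur_pct;

	return projected_bw < user_bw;
}
```

For example, 900 MBps measured at the 90% level projects to 1000 MBps at 100%, so throttling can be relaxed for a 1200 MBps target. The same 900 MBps at the 50% level projects to 1080 MBps at 60%, which exceeds a 1000 MBps target, so the level is left alone. No state carries over from earlier polls, so a workload phase change cannot leave a stale value behind.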

Fixes: ba0f26d8529c ("x86/intel_rdt/mba_sc: Prepare for feedback loop")
Reported-by: Xiaochen Shen <xiaochen.shen@xxxxxxxxx>
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
---

This patch was previously posted in:

https://lore.kernel.org/lkml/ZVU+L92LRBbJXgXn@agluck-desk3/#t

as part of a proposal to allow the mba_MBps mount option to base its
feedback loop input on total bandwidth instead of local bandwidth.
Sending it now as a standalone patch because Xiaochen reported that
real systems have experienced problems when delta_bw is incorrectly
calculated.

Does this new heuristic fix the problems you observe? If so, would you be
willing to provide a "Tested-by" tag?

Yes. The patch fixed the problem. I will reply to the original thread to add a "Tested-by" tag.
Thank you very much for your help!


Thank you.

Reinette

Best regards,
Xiaochen