[PATCH v3 0/4] sched/pelt: Relax the sync of *_sum with *_avg

From: Vincent Guittot
Date: Tue Jan 11 2022 - 08:47:14 EST


Rick reported performance regressions in bugzilla because of cpu
frequency being lower than before:
https://bugzilla.kernel.org/show_bug.cgi?id=215045

He bisected the problem to:
commit 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")

More details are available in commit message of patch 1.

This patchset reverts the commit above and adds several checks when
propagating the changes in the hierarchy to make sure that we still have
coherent util_avg and util_sum.

Dietmar found a simple way to reproduce the WARN fixed by
commit 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")
by looping on hackbench in several different sched group levels.

This patchset as run on the reproducer with success but it probably needs
more tests by people who faced the WARN before.

The changes done on util_sum have been also applied to runnable_sum and
load_sum which faces the same rounding problem although this has not been
reflected in measurable performance impact.

Changes for v3:
- split patch 1 in 2 patches
- One to fix rick's regression
- One to apply same changes in other places
- some typos
- move main comment so it appears in the 1st patch

Changes for v2:
- fix wrong update of load_sum
- move a change from patch 3 to patch 2
- update patch 3 commit message

Vincent Guittot (4):
sched/pelt: Relax the sync of util_sum with util_avg
sched/pelt: Continue to relax the sync of util_sum with util_avg
sched/pelt: Relax the sync of runnable_sum with runnable_avg
sched/pelt: Relax the sync of load_sum with load_avg

kernel/sched/fair.c | 113 +++++++++++++++++++++++++++++---------------
1 file changed, 75 insertions(+), 38 deletions(-)

--
2.17.1