Re: [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()

From: Peter Zijlstra

Date: Thu Jun 18 2026 - 05:46:26 EST


On Wed, Jun 17, 2026 at 03:21:51PM +0800, Xin Zhao wrote:
> In sched_balance_rq(), it is possible to call need_active_balance() twice
> in quick succession, which is not appropriate. There are two conditions in
> sched_balance_rq() that reset balance_interval to min_interval, one is
> when the local variable active_balance is 0, and the other is when
> need_active_balance() returns a non-zero value. The local variable
> active_balance is initialized to 0. Therefore, the only situation in which
> balance_interval is NOT reset to min_interval is if need_active_balance()
> has been executed once, marking the local variable active_balance as 1,
> and then the second call to need_active_balance() returns 0. In other
> words, the case is that during the interval between two close calls to
> need_active_balance(), busiest rq completes the recently dispatched active
> balance stop work, which is quite rare.
>
> There are mainly two scenarios that lead to reaching sched_balance_rq():
> one is the newly idle balance triggered by __schedule(), and the other is
> the periodic balance logic controlled by sd->balance_interval or
> nohz.next_balance, which ultimately executes in the softirq context. The
> vast majority of cases executing sched_balance_rq() is the first scenario.
> During the execution of __schedule(), preemption is disabled, so the
> interval between two checks of need_active_balance() will not be long.
> Thus, only in the second scenario may balance_interval NOT be reset to
> min_interval, and even then it is unlikely. The second scenario runs in
> softirq context, where the two need_active_balance() checks can be
> preempted by other tasks, leading to a longer interval between the two
> checks. However, there is no evidence to suggest that skipping the reset
> to min_interval in these low-probability cases caused by scheduling
> preemption offers any significant benefit. It would be better to simplify
> this complex reset logic for balance_interval to an unconditional reset.

This is very confusing, and my AI helper isn't helping much this time
around.

active_balance is initialized to 0; it is only (but not always) set to 1
when need_active_balance() returns true.

Therefore, the condition: !active_balance || need_active_balance() is a
truism and can be removed.

Or am I missing something more complicated?
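
A minimal sketch of that reasoning, as a standalone model rather than the
kernel code. It assumes need_active_balance() returns the same value on
both calls, which is exactly the assumption the changelog questions. The
names resets_interval(), condition_is_truism(), 'need' and 'set_flag' are
hypothetical and exist only for this illustration:

```c
#include <stdbool.h>

/*
 * Hypothetical model of the flag handling in sched_balance_rq().
 * 'need' stands in for the result of need_active_balance(), ASSUMED to be
 * identical on both calls. 'set_flag' models the "but not always" part:
 * even when need is true, active_balance is not necessarily set.
 */
static bool resets_interval(bool need, bool set_flag)
{
	int active_balance = 0;	/* initialized to 0 */

	/* Only (but not always) set to 1 when need_active_balance() is true. */
	if (need && set_flag)
		active_balance = 1;

	/* The condition the patch removes. */
	return !active_balance || need;
}

/* Enumerate all four input combinations of the model above. */
static bool condition_is_truism(void)
{
	for (int need = 0; need <= 1; need++)
		for (int set_flag = 0; set_flag <= 1; set_flag++)
			if (!resets_interval(need, set_flag))
				return false;
	return true;
}
```

Under that assumption the condition holds for every input. It can only
evaluate false if the second need_active_balance() call returns 0 after
the first returned non-zero, which this model deliberately excludes.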

> Signed-off-by: Xin Zhao <jackzxcui1989@xxxxxxx>
> ---
> kernel/sched/fair.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2b9653623..9c78241e9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -13464,10 +13464,8 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
> sd->nr_balance_failed = 0;
> }
>
> - if (likely(!active_balance) || need_active_balance(&env)) {
> - /* We were unbalanced, so reset the balancing interval */
> - sd->balance_interval = sd->min_interval;
> - }
> + /* We were unbalanced, so reset the balancing interval */
> + sd->balance_interval = sd->min_interval;
>
> goto out;
>
> --
> 2.34.1
>