Re: [PATCH -next] sched: Dec __cfs_bandwidth_used in destroy_cfs_bandwidth()

From: Daniel Jordan
Date: Mon Jul 12 2021 - 12:27:28 EST


[cc more, leaving full context]

On Tue, Jul 06, 2021 at 04:38:20PM +0800, Zhang Qiao wrote:
> __cfs_bandwidth_used is a static_key to control cfs bandwidth
> feature. When adding a cfs_bandwidth group, we need to increase
> the key, and decrease it when removing. But currently when we
> remove a cfs_bandwidth group, we don't decrease the key and
> this switch will always be on even if there is no cfs bandwidth
> group in the system.

Yep, that's broken.
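
To spell out the imbalance, here is a toy userspace model of the key's
reference count. It is not kernel code: toy_key, toy_group_add(),
toy_group_remove() and toy_key_on() are made-up names, and the
fix_applied flag only stands in for the decrement added by the hunk
below.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the static key: just an enable count. */
struct toy_key {
	int enabled;
};

/* Adding a bandwidth-limited group bumps the count. */
static void toy_group_add(struct toy_key *key)
{
	key->enabled++;
}

/*
 * Removing a group: without the fix nothing drops the count,
 * with the fix it is decremented once per removed group.
 */
static void toy_group_remove(struct toy_key *key, bool fix_applied)
{
	if (fix_applied) {
		assert(key->enabled > 0);
		key->enabled--;
	}
}

/* The key counts as "on" for as long as the count is non-zero. */
static bool toy_key_on(const struct toy_key *key)
{
	return key->enabled > 0;
}
```

With fix_applied false, one add followed by one remove leaves the key
on with no groups left, which is the behaviour described above.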

> Therefore, when removing a cfs bandwidth group, we decrease
> __cfs_bandwidth_used by calling cfs_bandwidth_usage_dec().
>
> Fixes: 56f570e512ee ("sched: use jump labels to reduce overhead when bandwidth control is inactive")
> Signed-off-by: Zhang Qiao <zhangqiao22@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 103e31e53e2b..857e8908b7f7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5344,6 +5344,9 @@ static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
>  	if (!cfs_b->throttled_cfs_rq.next)
>  		return;
>
> +	if (cfs_b->quota != RUNTIME_INF)
> +		cfs_bandwidth_usage_dec();

This calls static_key_slow_dec_cpuslocked(), which expects the caller to
hold the CPU hotplug lock, but destroy_cfs_bandwidth() doesn't hold it.

The other caller of cfs_bandwidth_usage_dec needs to hold it for another
reason, so what about having both cfs_bandwidth_usage_dec() and
cfs_bandwidth_usage_dec_cpuslocked()? In that case, the _inc one could
be renamed similarly so this isn't a stumbling block later on.
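
Roughly the split I mean, as a userspace model rather than a patch. The
model_* helpers are stand-ins I made up for the hotplug lock and the
static_key calls, and the *_cpuslocked() usage helpers are only the
proposed names; they don't exist in the tree today.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the CPU hotplug lock; tracks only whether it is held. */
static bool model_cpus_locked;

static void model_cpus_read_lock(void)
{
	assert(!model_cpus_locked);
	model_cpus_locked = true;
}

static void model_cpus_read_unlock(void)
{
	assert(model_cpus_locked);
	model_cpus_locked = false;
}

/* Stand-in for the __cfs_bandwidth_used key: just an enable count. */
static int model_key_count;

/* Modelled on static_key_slow_{inc,dec}_cpuslocked(): lock must be held. */
static void model_key_inc_cpuslocked(void)
{
	assert(model_cpus_locked);
	model_key_count++;
}

static void model_key_dec_cpuslocked(void)
{
	assert(model_cpus_locked);
	assert(model_key_count > 0);
	model_key_count--;
}

/* Modelled on static_key_slow_dec(): takes the lock itself. */
static void model_key_dec(void)
{
	model_cpus_read_lock();
	model_key_dec_cpuslocked();
	model_cpus_read_unlock();
}

/* Proposed names: for callers that already hold the hotplug lock. */
static void cfs_bandwidth_usage_inc_cpuslocked(void)
{
	model_key_inc_cpuslocked();
}

static void cfs_bandwidth_usage_dec_cpuslocked(void)
{
	model_key_dec_cpuslocked();
}

/* For callers that don't hold it, e.g. destroy_cfs_bandwidth(). */
static void cfs_bandwidth_usage_dec(void)
{
	model_key_dec();
}
```

The existing caller would switch to the _cpuslocked names and keep its
locking as is, while destroy_cfs_bandwidth() would use the plain
cfs_bandwidth_usage_dec().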

> +
>  	hrtimer_cancel(&cfs_b->period_timer);
>  	hrtimer_cancel(&cfs_b->slack_timer);
>  }
> --
> 2.18.0.huawei.25
>