Re: [PATCH v2 2/2] sched/core: Avoid unnecessary update in tg_set_cfs_bandwidth
From: Benjamin Segall
Date: Tue Jul 23 2024 - 21:27:20 EST
Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> writes:
> In a Kubernetes production environment, we have observed a high
> frequency of writes to cpu.max, approximately every 2~4 seconds for each
> cgroup, with the same value being written each time. This can result in
> unnecessary overhead, especially on machines with a large number of CPUs
> and cgroups.
>
> This is because kubelet and runc attempt to persist resource
> configurations through frequent updates with the same value.
> While optimizations can be made to kubelet and runc to avoid such
> overhead (e.g. checking the current value of the cpu request/limit before
> writing to cpu.max), it is still worth bailing out of
> tg_set_cfs_bandwidth() if we attempt to update with the same value.
>
> Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>
> ---
> kernel/sched/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7720d34bd71b..0cc564f45511 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -9090,6 +9090,9 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota,
>  	guard(cpus_read_lock)();
>  	guard(mutex)(&cfs_constraints_mutex);
>  
> +	if (cfs_b->period == ns_to_ktime(period) && cfs_b->quota == quota && cfs_b->burst == burst)
> +		return 0;
> +
>  	ret = __cfs_schedulable(tg, period, quota);
>  	if (ret)
>  		return ret;
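
For illustration only, here is a minimal userspace sketch of the early
return added in the hunk above. It is not kernel code: struct bw_model,
bw_unchanged() and bw_set() are hypothetical names, the "applied" counter
merely stands in for the schedulability check and per-CPU update that the
real function performs, and the locking and the ns_to_ktime() conversion
are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the bandwidth settings of one group. */
struct bw_model {
	uint64_t period;
	uint64_t quota;
	uint64_t burst;
	unsigned int applied;	/* number of times the expensive path ran */
};

/* True when the requested values match what is already stored. */
static bool bw_unchanged(const struct bw_model *b, uint64_t period,
			 uint64_t quota, uint64_t burst)
{
	return b->period == period && b->quota == quota && b->burst == burst;
}

/* Model of the setter: skip all work when nothing would change. */
static int bw_set(struct bw_model *b, uint64_t period, uint64_t quota,
		  uint64_t burst)
{
	if (bw_unchanged(b, period, quota, burst))
		return 0;

	/* Stand-in for the validation and per-CPU update in the real code. */
	b->period = period;
	b->quota = quota;
	b->burst = burst;
	b->applied++;

	return 0;
}
```

With this model, writing the same (period, quota, burst) triple repeatedly
increments "applied" only once, which mirrors the periodic identical
cpu.max writes described in the commit message.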