Re: [PATCH v3] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug

From: Vishal Chourasia
Date: Tue Dec 10 2024 - 10:38:38 EST

In reply to: Peter Zijlstra: "Re: [PATCH v3] sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug"

On Tue, Dec 10, 2024 at 03:43:07PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 10, 2024 at 03:53:47PM +0530, Vishal Chourasia wrote:
> > CPU controller limits are not properly enforced during CPU hotplug
> > operations, particularly during CPU offline. When a CPU goes offline,
> > throttled processes are unintentionally being unthrottled across all CPUs
> > in the system, allowing them to exceed their assigned quota limits.
> >
> > Consider the following example:
> >
> > A cgroup is assigned a 6.25% bandwidth limit on an 8-CPU system. The
> > workload runs 8 threads for 20 seconds at 100% CPU utilization, so the
> > expected (user+sys) time is 10 seconds.
> >
> > $ cat /sys/fs/cgroup/test/cpu.max
> > 50000 100000
> >
> > $ ./ebizzy -t 8 -S 20 // non-hotplug case
> > real 20.00 s
> > user 10.81 s // intended behaviour
> > sys 0.00 s
> >
> > $ ./ebizzy -t 8 -S 20 // hotplug case
> > real 20.00 s
> > user 14.43 s // Workload is able to run for 14 secs
> > sys 0.00 s // when it should have only run for 10 secs
> >
> > During CPU hotplug, scheduler domains are rebuilt and cpu_attach_domain
> > is called for every active CPU to update the root domain. That ends up
> > calling rq_offline_fair which un-throttles any throttled hierarchies.
> >
> > Unthrottling should only occur for the CPU being hotplugged to allow its
> > throttled processes to become runnable and get migrated to other CPUs.
> >
> > With current patch applied,
> > $ ./ebizzy -t 8 -S 20 // hotplug case
> > real 21.00 s
> > user 10.16 s // intended behaviour
> > sys 0.00 s
> >
> > Note: the hotplug operation (online, offline) was performed in a while(1) loop
> >
> > Signed-off-by: Vishal Chourasia <vishalc@xxxxxxxxxxxxx>
> > Tested-by: Madadi Vineeth Reddy <vineethr@xxxxxxxxxxxxx>
>
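
As a sanity check on the numbers in the changelog above, the 10 second
expectation follows directly from the cpu.max values. A minimal sketch of
that arithmetic (the helper names are made up for illustration and are not
kernel or cgroup APIs):

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: rough upper bound on the CPU time
 * (user+sys, in seconds) a cgroup can consume over wall_secs seconds of
 * wall-clock time, given the quota and period (in microseconds) from cpu.max.
 */
double expected_cpu_seconds(long quota_us, long period_us, double wall_secs)
{
	assert(quota_us >= 0 && period_us > 0);
	return wall_secs * (double)quota_us / (double)period_us;
}

/* Hypothetical helper: the same quota as a percentage of an nr_cpus system. */
double quota_percent_of_system(long quota_us, long period_us, int nr_cpus)
{
	assert(period_us > 0 && nr_cpus > 0);
	return 100.0 * (double)quota_us / ((double)period_us * nr_cpus);
}
```

With quota 50000, period 100000 and a 20 second run, the first helper gives
10 seconds, and the second gives 6.25% on 8 CPUs.
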
> Did you mean this?
Yes, essentially this.
I will post another version.

>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2c4ebfc82917..b6afb8337e73 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6696,6 +6696,9 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
>
>  	lockdep_assert_rq_held(rq);
>
> +	if (cpumask_test_cpu(cpu_of(rq), cpu_active_mask))
> +		return;
> +
>  	/*
>  	 * The rq clock has already been updated in the
>  	 * set_rq_offline(), so we should skip updating
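
To illustrate the effect of the early return, here is a toy userspace model,
not kernel code. It assumes the rebuild path visits every runqueue, as
described in the changelog. The toy_* names and the bitmask standing in for
cpu_active_mask are invented for this sketch:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Toy stand-in for a runqueue; this is not the kernel's struct rq. */
struct toy_rq {
	int cpu;
	bool throttled;	/* models a throttled cfs_rq hierarchy on this CPU */
};

/*
 * Models the guarded unthrottle path. Bit n of active_mask set means
 * CPU n is still active (stand-in for cpu_active_mask).
 */
void toy_unthrottle_offline(struct toy_rq *rq, unsigned int active_mask)
{
	assert(rq->cpu >= 0 && rq->cpu < TOY_NR_CPUS);

	/* Mirrors the early return: leave CPUs that are still active alone. */
	if (active_mask & (1u << rq->cpu))
		return;

	rq->throttled = false;
}

/*
 * Models a domain rebuild that calls the offline path on every runqueue.
 * Returns how many runqueues remain throttled afterwards.
 */
int toy_rebuild_domains(struct toy_rq *rqs, int n, unsigned int active_mask)
{
	int still_throttled = 0;

	for (int i = 0; i < n; i++) {
		toy_unthrottle_offline(&rqs[i], active_mask);
		if (rqs[i].throttled)
			still_throttled++;
	}
	return still_throttled;
}
```

With all 8 runqueues throttled and only CPU 3 cleared from the mask, the
rebuild leaves 7 runqueues throttled. Without the check it would leave none.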

What should be done when the hotplugged CPU's cfs_rq still has plenty of
runtime_remaining?

I see three options:
1) set it to 1 (no change required in the current code)
2) skip the reset, so runtime_remaining is not touched (similar to the current patch)
3) return the excess runtime to the global runtime (will require taking a lock)
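
To make the trade-off concrete, the three options can be sketched as a toy
userspace model. This is not kernel code: the structs and the enum are
invented for illustration, locking is omitted, and leaving 1 ns behind in
option 3 is my assumption rather than something settled in this thread:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; field names echo the thread, the types are hypothetical. */
struct toy_cfs_bandwidth {
	long long runtime;		/* global pool, in ns */
};

struct toy_cfs_rq {
	long long runtime_remaining;	/* local runtime, in ns */
};

enum toy_offline_policy {
	TOY_SET_TO_ONE,		/* option 1 */
	TOY_KEEP,		/* option 2 */
	TOY_RETURN_TO_POOL,	/* option 3 */
};

void toy_offline_runtime(struct toy_cfs_bandwidth *b,
			 struct toy_cfs_rq *cfs_rq,
			 enum toy_offline_policy policy)
{
	assert(b != NULL && cfs_rq != NULL);

	switch (policy) {
	case TOY_SET_TO_ONE:
		/* Any leftover local runtime is simply discarded. */
		cfs_rq->runtime_remaining = 1;
		break;
	case TOY_KEEP:
		/* Local runtime is left untouched. */
		break;
	case TOY_RETURN_TO_POOL:
		/* Real code would need to hold the bandwidth lock here. */
		if (cfs_rq->runtime_remaining > 1) {
			b->runtime += cfs_rq->runtime_remaining - 1;
			cfs_rq->runtime_remaining = 1;
		}
		break;
	}
}
```

In this model only option 3 changes the global pool. Option 1 drops the
unused local runtime, and option 2 leaves it on the offlined CPU's cfs_rq.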

Thanks
- vishalc
