Re: [PATCH V3 1/2] sched: Reduce the default slice to avoid tasks getting an extra tick
From: Vincent Guittot
Date: Tue Feb 25 2025 - 05:18:26 EST
On Tue, 25 Feb 2025 at 02:29, Vincent Guittot
<vincent.guittot@xxxxxxxxxx> wrote:
>
> On Tue, 25 Feb 2025 at 01:25, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> >
> > On 02/24/25 15:15, Vincent Guittot wrote:
> > > On Mon, 10 Feb 2025 at 10:13, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Feb 10, 2025 at 01:29:31AM +0000, Qais Yousef wrote:
> > > >
> > > > > I brought the topic up of these magic values with Peter and Vincent in LPC as
> > > > > I think this logic is confusing. I have nothing against your patch, but if the
> > > > > maintainers agree I am in favour of removing it completely in favour of setting
> > > > > it to a single value that is the same across all systems.
> > > >
> > > > You're talking about the scaling, right?
> > > >
> > > > Yeah, it is of limited use. The cap at 8, combined with the fact that
> > > > it's really hard to find a machine with fewer than 8 CPUs, makes the
> > > > whole thing mostly useless.
> > > >
> > > > Back when we did this, we still had dual-core laptops. Now phones have
> > > > 8 or more CPUs on.
> > > >
> > > > So I don't think I mind ripping it out.
> > >
> > > Besides the question of ripping it out or not, we still have a number
> > > of devices with fewer than 8 cores, but they are not targeting phones,
> > > laptops or servers ...
> >
> > I'm not sure if this is in favour or against the rip out, or highlighting a new
> > problem. But in case it is against the rip-out, hopefully my answer in [1]
>
> My comment was only about the assumption that systems now have 8 CPUs
> or more, so that scaling doesn't make any real difference in the end.
> That assumption is not really true.
>
> > highlights why the relationship to CPU number is actually weak and not really
> > helping much - I think it is making implicit assumptions about the workloads and
> > I don't think this holds anymore. Ignore me otherwise :-)
>
> Then regarding the scaling factor, I don't have a strong opinion but I
> would not be so definitive about its uselessness as there are a few
> things to take into account:
> - From a scheduling PoV, the scheduling delay is impacted by larger
> slices on devices with a small number of CPUs, even for lightly loaded
> cases
> - 1000 HZ with a 1ms slice will generate about 3 times more context
> switches than 2.8ms in a steady loaded case, and if some people were
> concerned about using 1000hz by default, they will not feel better
> with a 1ms slice
Figures showing that there is no major regression from using a base
slice < 1ms everywhere would be a good starting point.
A slight performance regression has just been reported for this
patch, which moves the base slice from 3ms down to 2.8ms [1].
[1] https://lore.kernel.org/lkml/202502251026.bb927780-lkp@xxxxxxxxx/
> - 1ms is not a good value. In fact anything which is a multiple of the
> tick is not a good number as the actual time accounted to the task is
> usually less than the tick
> - And you can always set the scaling to none with tunable_scaling to
> get a fixed 0.7ms default slice whatever the number of CPUs
>
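To make the numbers above concrete, here is a minimal userspace sketch of
the arithmetic being discussed. It is not the kernel code: the names
(scaling_factor, scaled_slice_ns, ticks_to_expire, the SCALING_* enum) are
hypothetical, the 0.7ms base and the cap at 8 CPUs are taken from this
thread, and the 0.95ms of runtime accounted per 1ms tick used in the note
below is only an illustrative assumption for "usually less than the tick".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; this is not the kernel source. */
enum scaling_mode { SCALING_NONE, SCALING_LOG, SCALING_LINEAR };

#define BASE_SLICE_NS 700000ULL /* 0.7ms unscaled base slice from this thread */
#define CPU_CAP       8U        /* scaling stops growing beyond 8 CPUs */

/* Integer log2, rounded down; ilog2_u(8) == 3. */
static unsigned int ilog2_u(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Factor applied to the base slice for a given CPU count and mode. */
static unsigned int scaling_factor(unsigned int ncpus, enum scaling_mode mode)
{
	assert(ncpus > 0);

	unsigned int cpus = ncpus < CPU_CAP ? ncpus : CPU_CAP;

	switch (mode) {
	case SCALING_NONE:
		return 1;
	case SCALING_LINEAR:
		return cpus;
	case SCALING_LOG:
	default:
		return 1 + ilog2_u(cpus);
	}
}

/* Resulting default slice in nanoseconds. */
static uint64_t scaled_slice_ns(unsigned int ncpus, enum scaling_mode mode)
{
	return BASE_SLICE_NS * scaling_factor(ncpus, mode);
}

/*
 * Number of ticks before slice expiry is noticed at a tick, assuming each
 * tick accounts accounted_ns of runtime to the task (an assumed value,
 * typically a bit less than the tick period).
 */
static unsigned int ticks_to_expire(uint64_t slice_ns, uint64_t accounted_ns)
{
	assert(accounted_ns > 0);
	return (unsigned int)((slice_ns + accounted_ns - 1) / accounted_ns);
}
```

With log scaling this gives 0.7ms on 1 CPU, 1.4ms on 2, 2.1ms on 4 and
2.8ms on 8 or more, and 0.7ms everywhere with scaling set to none. Under
the assumed 0.95ms accounted per tick, ticks_to_expire() returns 4 for a
3ms slice but 3 for 2.8ms, and 2 for a 1ms slice, which is the "extra
tick" effect for slices that are a multiple of the tick.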
> >
> > FWIW a Raspberry Pi can be used as a server, a personal computer, a multimedia
> > entertainment system, a dumb sensor recorder/relayer or anything else. I think
> > most systems expect to run a variety of workloads, and IMHO the need for
> > a reasonable default base_slice to ensure timely progress of all running tasks
> > when the system is overloaded has little relation to NR_CPUs nowadays.
> >
> > [1] https://lore.kernel.org/all/20250210230500.53mybtyvzhdagot5@airbuntu/