Re: Very high scheduling delay with plenty of idle CPUs
From: Vincent Guittot
Date: Mon Nov 11 2024 - 14:12:25 EST
On Mon, 11 Nov 2024 at 20:01, Vincent Guittot
<vincent.guittot@xxxxxxxxxx> wrote:
>
> On Mon, 11 Nov 2024 at 19:24, Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> >
> > On Mon, Nov 11, 2024 at 2:41 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sun, Nov 10, 2024 at 10:15:07PM -0800, Saravana Kannan wrote:
> > >
> > > > I actually quickly hacked up the cpu_overutilized() function to return
> > > > true during suspend/resume and the threads are nicely spread out and
> > > > running in parallel. That actually reduces the total of the
> > > > dpm_resume*() phases from 90ms to 75ms on my Pixel 6.
> > >
> > > Right, so that kills EAS and makes it fall through to the regular
> > > select_idle_sibling() thing.
> > >
> > > > Peter,
> > > >
> > > > Would you be open to the scheduler being aware of
> > > > dpm_suspend*()/dpm_resume*() phases and triggering the CPU
> > > > overutilized behavior during these phases? I know it's a very use case
> > > > specific behavior but how often do we NOT want to speed up
> > > > suspend/resume? We can make this a CONFIG or a kernel command line
> > > > option -- say, fast_suspend or something like that.
> > >
> > > Well, I don't mind if Vincent doesn't. It seems like a very
> > > specific/targeted thing and should not affect much else, so it is a
> > > relatively safe thing to do.
> > >
> > > Perhaps a more direct hack in is_rd_overutilized() would be even less
> > > invasive; changing cpu_overutilized() relies on that getting propagated
> > > to rd->overutilized, so we might as well skip that step, no?
> >
> > is_rd_overutilized() sounds good to me. Outside of setting a flag in
>
> As of now, I'm not convinced that this is a solution rather than just a
> quick hack for your problem. We must first understand what is wrong.
And you would be better off switching to the performance cpufreq governor,
which disables EAS and runs at max frequency, if you want to decrease
latency further.
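
For reference, a minimal userspace sketch of switching one cpufreq policy
to the performance governor by writing to its sysfs file. set_governor() is
a hypothetical helper made up for illustration; only the sysfs file itself
is an existing interface.

```c
#include <stdio.h>

/*
 * Write a governor name (e.g. "performance") to one cpufreq policy's
 * scaling_governor file. Returns 0 on success, -1 on failure.
 * set_governor() is a hypothetical helper, not a kernel or libc API.
 */
int set_governor(const char *path, const char *governor)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(governor, f) == EOF) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer, so a rejected write is reported here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

It would typically be called as root, once per policy, with a path such as
/sys/devices/system/cpu/cpufreq/policy0/scaling_governor.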
>
> > sched.c that the suspend/resume code sets/clears, I can't think of an
> > interface that's better at avoiding abuse. Let me know if you have
> > any. Otherwise, I'll just go with the flag option. If Vincent gets the
> > scheduler to do the right thing without this, I'll happily drop this
> > targeted hack.
> >
> > -Saravana
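
To make the flag option discussed above concrete, here is a rough userspace
model of it, not a kernel patch. pm_transition_active and
sched_set_pm_transition() are hypothetical names, struct root_domain is cut
down to the one field that matters here, and the real is_rd_overutilized()
also checks whether EAS is enabled, which this sketch leaves out.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct root_domain. */
struct root_domain {
	bool overutilized;
};

/* Hypothetical flag, set and cleared by the suspend/resume code. */
static bool pm_transition_active;

/* Hypothetical hook the dpm_suspend*()/dpm_resume*() paths would call. */
void sched_set_pm_transition(bool active)
{
	pm_transition_active = active;
}

/*
 * Report the root domain as overutilized while a PM transition is in
 * progress, so that wakeups skip the energy-aware path and fall through
 * to the regular idle-CPU selection.
 */
bool is_rd_overutilized(const struct root_domain *rd)
{
	return pm_transition_active || rd->overutilized;
}
```

The suspend/resume code would set the flag on entry to the dpm phases and
clear it on exit; outside that window the result depends only on
rd->overutilized, as before.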