Re: Very high scheduling delay with plenty of idle CPUs
From: Vincent Guittot
Date: Mon Nov 11 2024 - 14:02:02 EST
On Mon, 11 Nov 2024 at 19:17, Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> On Mon, Nov 11, 2024 at 3:15 AM Vincent Guittot
> <vincent.guittot@xxxxxxxxxx> wrote:
> >
> > On Mon, 11 Nov 2024 at 11:41, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sun, Nov 10, 2024 at 10:15:07PM -0800, Saravana Kannan wrote:
> > >
> > > > I actually quickly hacked up the cpu_overutilized() function to return
> > > > true during suspend/resume and the threads are nicely spread out and
> > > > running in parallel. That actually reduces the total of the
> > > > dpm_resume*() phases from 90ms to 75ms on my Pixel 6.
> > >
> > > Right, so that kills EAS and makes it fall through to the regular
> > > select_idle_sibling() thing.
> > >
> > > > Peter,
> > > >
> > > > Would you be open to the scheduler being aware of
> > > > dpm_suspend*()/dpm_resume*() phases and triggering the CPU
> > > > overutilized behavior during these phases? I know it's a very use case
> > > > specific behavior but how often do we NOT want to speed up
> > > > suspend/resume? We can make this a CONFIG or a kernel command line
> > > > option -- say, fast_suspend or something like that.
> > >
> > > Well, I don't mind if Vincent doesn't. It seems like a very
> > > specific/targeted thing and should not affect much else, so it is a
> > > relatively safe thing to do.
> >
> > I would like to understand why all idle little cpus are not used in
> > saravana's example and tasks are packed on the same cpu instead.
>
> If you want to try this on your end and debug it further, it should be
> pretty easy to reproduce on a Pixel 6 even without my suspend/resume
> changes.
Are you using v6.12-rc5 on the Pixel 6?
>
> Just run this on the device to mark all devices as async
> suspend/resume. This assumes you have CONFIG_PM_DEBUG enabled.
>
> find /sys/devices/ -name async | while read -r filename; do echo enabled > "$filename"; done
>
> And look at the dpm_resume_noirq() phase. You should see some kworkers
> that are runnable but not running for some time while a little CPU is
> idle. It should happen within a few tries. You need to unplug the USB
> cable to let the device suspend and wait at least 10 seconds after the
> screen goes off.
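For anyone retrying the step quoted above, here is a minimal sketch of the same loop wrapped in a function. The name mark_async and the optional root-directory argument are assumptions added for illustration (the argument only makes it possible to dry-run against a scratch directory); neither comes from the original command.

```shell
#!/bin/sh
# Sketch only: enable async suspend/resume for every device under a sysfs root.
# mark_async is a hypothetical helper name, not an existing tool.
# The optional first argument (default /sys/devices) is an assumption that
# lets the loop be tried on a scratch directory before touching real sysfs.
mark_async() {
    root="${1:-/sys/devices}"
    find "$root" -name async 2>/dev/null | while read -r filename; do
        # Same write as the quoted one-liner; report failures instead of
        # stopping, since some attributes may not be writable.
        if ! ( echo enabled > "$filename" ) 2>/dev/null; then
            echo "could not write $filename" >&2
        fi
    done
}
```

Running mark_async with no argument as root should behave like the quoted one-liner; the async attributes are only expected to be present with CONFIG_PM_DEBUG enabled, as noted above.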
>
> But even if you fix EAS to pick little CPUs, I think we also want to
> use the mid and big CPUs. That's not going to happen right?
Who knows?
Right now, the trace that you shared clearly shows incorrect behavior.
>
> -Saravana
>
> > >
> > > Perhaps a more direct hack in is_rd_overutilized() would be even less
> > > invasive, changing cpu_overutilized() relies on that getting propagated
> > > to rd->overutilized, might as well skip that step, no?