Re: Very high scheduling delay with plenty of idle CPUs
From: Saravana Kannan
Date: Mon Nov 11 2024 - 13:31:07 EST
On Mon, Nov 11, 2024 at 3:15 AM Vincent Guittot
<vincent.guittot@xxxxxxxxxx> wrote:
>
> On Mon, 11 Nov 2024 at 11:41, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Sun, Nov 10, 2024 at 10:15:07PM -0800, Saravana Kannan wrote:
> >
> > > I actually quickly hacked up the cpu_overutilized() function to return
> > > true during suspend/resume and the threads are nicely spread out and
> > > running in parallel. That actually reduces the total of the
> > > dpm_resume*() phases from 90ms to 75ms on my Pixel 6.
> >
> > Right, so that kills EAS and makes it fall through to the regular
> > select_idle_sibling() thing.
> >
> > > Peter,
> > >
> > > Would you be open to the scheduler being aware of
> > > dpm_suspend*()/dpm_resume*() phases and triggering the CPU
> > > overutilized behavior during these phases? I know it's a very use case
> > > specific behavior but how often do we NOT want to speed up
> > > suspend/resume? We can make this a CONFIG or a kernel command line
> > > option -- say, fast_suspend or something like that.
> >
> > Well, I don't mind if Vincent doesn't. It seems like a very
> > specific/targeted thing and should not affect much else, so it is a
> > relatively safe thing to do.
>
> I would like to understand why all idle little cpus are not used in
> saravana's example and tasks are packed on the same cpu instead.
If you want to try this on your end and debug it further, it should be
pretty easy to reproduce on a Pixel 6 even without my suspend/resume
changes.
Just run this on the device to mark all devices as async
suspend/resume. This assumes you have CONFIG_PM_DEBUG enabled.
find /sys/devices/ -name async | while read -r filename; do echo enabled > "$filename"; done
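A slightly more defensive variant of the one-liner above, as a sketch only. The function name set_async_all and its optional root argument are made up for illustration; the default root is the same /sys/devices/ path used above. It only touches regular files named async and skips any that are not writable.

```shell
# Sketch: enable async suspend/resume for every device under a sysfs root.
# set_async_all and its optional root argument are hypothetical names used
# for illustration; the default matches the path used in this thread.
set_async_all() {
    root="${1:-/sys/devices/}"
    find "$root" -type f -name async | while read -r filename; do
        # Skip attributes that cannot be written instead of failing on each one.
        [ -w "$filename" ] || continue
        echo enabled > "$filename"
    done
}
```

Calling set_async_all with no argument should behave like the one-liner; passing a directory limits the change to one subtree.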
And look at the dpm_resume_noirq() phase. You should see some kworkers
that are runnable but not running for some time while a little CPU is
idle. It should happen within a few tries. You need to unplug the USB
cable to let the device suspend and wait at least 10 seconds after the
screen goes off.
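One possible way to capture this, sketched under assumptions: tracefs is mounted at /sys/kernel/tracing and the sched_wakeup, sched_switch and suspend_resume trace events are available on the kernel in question. The function name enable_resume_trace and its optional argument are hypothetical, and this is not necessarily how the numbers in this thread were collected.

```shell
# Sketch: turn on the scheduler and suspend/resume trace events so the gap
# between a kworker's wakeup and its first switch-in can be read from the trace.
# enable_resume_trace and its tracefs argument are hypothetical names.
enable_resume_trace() {
    tracefs="${1:-/sys/kernel/tracing}"
    for ev in sched/sched_wakeup sched/sched_switch power/suspend_resume; do
        enable_file="$tracefs/events/$ev/enable"
        # Bail out if this kernel does not expose the event.
        [ -e "$enable_file" ] || { echo "missing event: $ev" >&2; return 1; }
        echo 1 > "$enable_file"
    done
    echo 1 > "$tracefs/tracing_on"
}
```

After a suspend/resume cycle, the trace file under the same tracefs directory should contain the suspend_resume markers for dpm_resume_noirq along with the wakeup and switch events for the kworkers.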
But even if you fix EAS to pick little CPUs, I think we also want to
use the mid and big CPUs. That's not going to happen, right?
-Saravana
> >
> > Perhaps a more direct hack in is_rd_overutilized() would be even less
> > invasive, changing cpu_overutilized() relies on that getting propagated
> > to rd->overutilized, might as well skip that step, no?