Re: [PATCH v4 00/10] sched/fair: rework the CFS load balance
From: Phil Auld
Date: Wed Oct 30 2019 - 13:45:12 EST
On Wed, Oct 30, 2019 at 06:28:50PM +0100 Vincent Guittot wrote:
> On Wed, 30 Oct 2019 at 18:19, Phil Auld <pauld@xxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > On Wed, Oct 30, 2019 at 05:35:55PM +0100 Valentin Schneider wrote:
> > >
> > >
> > > On 30/10/2019 17:24, Dietmar Eggemann wrote:
> > > > On 30.10.19 15:39, Phil Auld wrote:
> > > >> Hi Vincent,
> > > >>
> > > >> On Mon, Oct 28, 2019 at 02:03:15PM +0100 Vincent Guittot wrote:
> > > >
> > > > [...]
> > > >
> > > >>>> When you say slow versus fast wakeup paths what do you mean? I'm still
> > > >>>> learning my way around all this code.
> > > >>>
> > > >>> When a task wakes up, we can decide to
> > > >>> - speed up the wakeup by shortening the list of CPUs and comparing only
> > > >>> prev_cpu vs this_cpu (in fact the groups of CPUs that share their
> > > >>> respective LLC). That's the fast wakeup path that is used most of the
> > > >>> time during a wakeup
> > > >>> - or start to find the idlest CPU of the system and scan all domains.
> > > >>> That's the slow path that is used for new tasks or when a task wakes
> > > >>> up a lot of other tasks at the same time
> > > >
> > > > [...]
> > > >
> > > > Is the latter related to wake_wide()? If yes, is the SD_BALANCE_WAKE
> > > > flag set on the sched domains on your machines? IMHO, otherwise those
> > > > wakeups are not forced into the slowpath (if (unlikely(sd)))?
> > > >
> > > > I had this discussion the other day with Valentin S. on #sched and we
> > > > were not sure how SD_BALANCE_WAKE is set on sched domains on
> > > > !SD_ASYM_CPUCAPACITY systems.
> > > >
> > >
> > > Well, from the code, nobody but us (asymmetric capacity systems) sets
> > > SD_BALANCE_WAKE. I was however curious if there were some folks who set it
> > > with out-of-tree code for some reason.
> > >
> > > As Dietmar said, not having SD_BALANCE_WAKE means you'll never go through
> > > the slow path on wakeups, because there is no domain with SD_BALANCE_WAKE for
> > > the domain loop to find. Depending on your topology you most likely will
> > > go through it on fork or exec though.
> > >
> > > IOW wake_wide() is not really widening the wakeup scan on wakeups using
> > > mainline topology code (disregarding asymmetric capacity systems), which
> > > sounds a bit... off.
> >
> > Thanks. It's not currently set. I'll set it and re-run to see if it makes
> > a difference.
>
> Because the fix only touches the slow path, and given Valentin's and
> Dietmar's comments on the wakeup path, that would mean your use case
> regularly creates new threads during the test?
>
I believe it is not creating any new threads during each run.
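
To make the SD_BALANCE_WAKE point from the quoted discussion concrete, here
is a simplified userspace sketch of the domain loop as described above. It is
not the kernel code: struct domain, pick_path() and the flag values are
hypothetical stand-ins, and it only models the non-affine case (fork, exec,
or a wakeup that is meant to be widened), where the walk stops at the first
domain that lacks the requested flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define SD_BALANCE_EXEC 0x1u
#define SD_BALANCE_FORK 0x2u
#define SD_BALANCE_WAKE 0x4u

/* Toy stand-in for a sched domain: just its flags and a link to the parent. */
struct domain {
	unsigned int flags;
	struct domain *parent;
};

enum path { FAST_PATH, SLOW_PATH };

/*
 * Walk the domain hierarchy bottom-up and remember the highest domain that
 * has sd_flag set, stopping at the first domain that does not have it.
 * If no domain matched, there is nothing for the slow path to scan, so the
 * fast path is taken.
 */
static enum path pick_path(const struct domain *lowest, unsigned int sd_flag)
{
	const struct domain *sd = NULL;
	const struct domain *tmp;

	assert(sd_flag != 0);

	for (tmp = lowest; tmp; tmp = tmp->parent) {
		if (!(tmp->flags & sd_flag))
			break;
		sd = tmp;
	}

	return sd ? SLOW_PATH : FAST_PATH;
}
```

With a topology whose domains only carry the fork and exec flags,
pick_path(lowest, SD_BALANCE_WAKE) returns FAST_PATH while
pick_path(lowest, SD_BALANCE_FORK) returns SLOW_PATH, which matches the
behaviour Valentin describes for mainline topology code.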
> >
> >
> > However, I'm not sure why it would be making a difference for only the cgroup
> > case. If this is causing issues I'd expect it to affect both runs.
> >
> > In general I think these threads want to wake up on the last cpu they were on.
> > And given there are fewer cpu-bound tasks than CPUs, that wake cpu should,
> > more often than not, be idle.
> >
> >
> > Cheers,
> > Phil