Re: [PATCH v2 1/8] sched/fair: Update affine statistics when needed

From: Srikar Dronamraju
Date: Fri May 07 2021 - 13:06:19 EST


* Valentin Schneider <valentin.schneider@xxxxxxx> [2021-05-07 17:08:17]:

> On 06/05/21 22:15, Srikar Dronamraju wrote:
> > wake_affine_idle() can return prev_cpu. Even in such a scenario,
> > scheduler was going ahead and updating schedstats related to wake
> > affine, i.e. even if the task is not moved across LLC domains,
> > schedstats would have been accounted.
> >
> > Hence add a check before updating schedstats.
> >

Thanks Valentin for taking a look at the patch.

>
> I briefly glanced at the git history but didn't find any proper description
> of that stat. As it stands, it counts the number of times wake_affine()
> purposely steered a task towards a particular CPU (waker or wakee's prev),
> so nr_wakeups_affine / nr_wakeups_affine_attempts is your wake_affine()
> "success rate" - how often could it make a choice with the available data.
>
> I could see a point in only incrementing the count if wake_affine() steers
> towards the waker rather than the wakee (i.e. don't increment if choice is
> prev), but then that has no link with LLC spans

Let's say prev CPU and this CPU are part of the same LLC, and prev CPU is
busy (or busier than this CPU). Should we consider this an affine wakeup?
If prev were idle, we would surely have chosen prev CPU. Also, since both
are part of the same LLC, we can't say this CPU is more affine than prev
CPU. Or maybe I am confusing wake_affine with cache_affine.

>
> > Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
> > Cc: Gautham R Shenoy <ego@xxxxxxxxxxxxxxxxxx>
> > Cc: Parth Shah <parth@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> > Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > Cc: Rik van Riel <riel@xxxxxxxxxxx>
> > Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 794c2cb945f8..a258a84cfdfd 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5884,8 +5884,10 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> >  	if (target == nr_cpumask_bits)
> >  		return prev_cpu;
> >
> > -	schedstat_inc(sd->ttwu_move_affine);
> > -	schedstat_inc(p->se.statistics.nr_wakeups_affine);
> > +	if (!cpus_share_cache(prev_cpu, target)) {
>
> Per the above, why? Why not just if(target == this_cpu) ?

We could use target == this_cpu. However, if prev CPU and this CPU share the
same LLC, should we consider moving to this_cpu an affine wakeup?

I could probably have moved this patch later in the series, but one of the
later patches, which introduces wake_affine_idler_llc(), may end up returning
a CPU that is none of this_cpu, prev_cpu or nr_cpumask_bits. In the case where
it returns a CPU closer to this_cpu, I would still mark it as wake_affine.
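
To make the three candidate conditions easier to compare, here is a minimal
userspace sketch, not kernel code. The topology (8 CPUs, two LLCs of 4 CPUs
each) and every name prefixed with toy_ or count_ are assumptions made up for
illustration; toy_cpus_share_cache() only stands in for the kernel's
cpus_share_cache().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy topology: 8 CPUs, two LLCs of 4 CPUs each. */
#define TOY_NR_CPUS  8
#define TOY_LLC_SIZE 4

/* Toy stand-in for cpus_share_cache(); not the kernel implementation. */
static bool toy_cpus_share_cache(int a, int b)
{
	assert(a >= 0 && a < TOY_NR_CPUS);
	assert(b >= 0 && b < TOY_NR_CPUS);
	return a / TOY_LLC_SIZE == b / TOY_LLC_SIZE;
}

/* Behaviour before the patch: any target chosen by wake_affine() is counted. */
static bool count_always(int prev_cpu, int this_cpu, int target)
{
	(void)prev_cpu;
	(void)this_cpu;
	(void)target;
	return true;
}

/* Suggested alternative: count only when steered to the waker's CPU. */
static bool count_if_this_cpu(int prev_cpu, int this_cpu, int target)
{
	(void)prev_cpu;
	return target == this_cpu;
}

/* This patch: count only when target sits in a different LLC than prev_cpu. */
static bool count_if_cross_llc(int prev_cpu, int this_cpu, int target)
{
	(void)this_cpu;
	return !toy_cpus_share_cache(prev_cpu, target);
}
```

With prev_cpu = 1 and this_cpu = 2 (same toy LLC), a wakeup steered to CPU 2
is counted by count_if_this_cpu() but not by count_if_cross_llc(). With
prev_cpu = 1, this_cpu = 5 and a target of 6 (close to this_cpu, but neither
this_cpu nor prev_cpu), only count_if_cross_llc() and count_always() count it,
which is the case the later patch in the series cares about.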

>
> > +		schedstat_inc(sd->ttwu_move_affine);
> > +		schedstat_inc(p->se.statistics.nr_wakeups_affine);
> > +	}
> >  	return target;
> >  }
> >
> > --
> > 2.18.2

--
Thanks and Regards
Srikar Dronamraju