Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE

From: Mike Galbraith
Date: Thu Jun 18 2015 - 00:13:03 EST


On Wed, 2015-06-17 at 20:46 -0700, Josef Bacik wrote:
> On 06/17/2015 05:55 PM, Mike Galbraith wrote:
> > On Wed, 2015-06-17 at 11:06 -0700, Josef Bacik wrote:
> >> On 06/11/2015 10:35 PM, Mike Galbraith wrote:
> >>> On Thu, 2015-05-28 at 13:05 +0200, Peter Zijlstra wrote:
> >
> >>> If sd == NULL, we fall through and try to pull wakee despite nacked-by
> >>> tsk_cpus_allowed() or wake_affine().
> >>>
> >>
> >> So maybe add a check in the if (sd_flag & SD_BALANCE_WAKE) for something
> >> like this
> >>
> >> if (tmp >= 0) {
> >> 	new_cpu = tmp;
> >> 	goto unlock;
> >> } else if (!want_affine) {
> >> 	new_cpu = prev_cpu;
> >> }
> >>
> >> so we can make sure we're not being pushed onto a cpu that we aren't
> >> allowed on? Thanks,
> >
> > The buglet is a messenger methinks. You saying the patch helped without
> > SD_BALANCE_WAKE being set is why I looked. The buglet would seem to say
> > that preferring cache is not harming your load after all. It now sounds
> > as though wake_wide() may be what you're squabbling with.
> >
> > Things aren't adding up all that well.
>
> Yeah I'm horribly confused. The other thing is I had to switch clusters
> (I know, I know, I'm changing the parameters of the test). So these new
> boxes are haswell boxes, but basically the same otherwise, 2 socket 12
> core with HT, just newer/faster CPUs. I'll re-run everything again and
> give the numbers so we're all on the same page again, but as it stands
> now I think we have this
>
> 3.10 with wake_idle forward ported - good
> 4.0 stock - 20% perf drop
> 4.0 w/ Peter's patch - good
> 4.0 w/ Peter's patch + SD_BALANCE_WAKE - 5% perf drop
>
> I can do all these iterations again to verify, is there any other
> permutation you'd like to see? Thanks,

Yeah, after re-baseline, please apply/poke these buttons individually in
4.0-virgin.

(cat /sys/kernel/debug/sched_features, prepend NO_, echo it back)
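
For anyone unfamiliar with the NO_ convention, here is a rough userspace model of what writing a name back to that file does. It is only a sketch: feat_names, feat_mask, feat_enabled() and feat_write() are hypothetical names made up for illustration, and the real parser lives in the kernel and covers every SCHED_FEAT() entry, not just the two knobs added below.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative subset of feature bits; indices are made up for this model. */
enum { FEAT_PREFER_IDLE, FEAT_WAKE_WIDE, FEAT_NR };

static const char *const feat_names[FEAT_NR] = { "PREFER_IDLE", "WAKE_WIDE" };

/* Both knobs default to true in the patch, so both bits start out set. */
static unsigned int feat_mask = (1u << FEAT_PREFER_IDLE) | (1u << FEAT_WAKE_WIDE);

/* Model of a sched_feat(x) test: is the bit for this feature set? */
static bool feat_enabled(int feat)
{
	assert(feat >= 0 && feat < FEAT_NR);
	return (feat_mask & (1u << feat)) != 0;
}

/*
 * Model of "echo NAME > sched_features": a leading "NO_" clears the bit,
 * a bare name sets it. Returns 0 on success, -1 for an unknown name.
 */
static int feat_write(const char *cmp)
{
	bool neg = false;
	int i;

	if (strncmp(cmp, "NO_", 3) == 0) {
		neg = true;
		cmp += 3;
	}

	for (i = 0; i < FEAT_NR; i++) {
		if (strcmp(cmp, feat_names[i]) == 0) {
			if (neg)
				feat_mask &= ~(1u << i);
			else
				feat_mask |= 1u << i;
			return 0;
		}
	}
	return -1;
}
```

In this model, feat_write("NO_WAKE_WIDE") turns one knob off and leaves PREFER_IDLE alone, which is the "poke individually" part of the request.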

---
 kernel/sched/fair.c     |    4 ++--
 kernel/sched/features.h |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4506,7 +4506,7 @@ static int wake_affine(struct sched_doma
 	 * If we wake multiple tasks be careful to not bounce
 	 * ourselves around too much.
 	 */
-	if (wake_wide(p))
+	if (sched_feat(WAKE_WIDE) && wake_wide(p))
 		return 0;

 	idx = sd->wake_idx;
@@ -4682,7 +4682,7 @@ static int select_idle_sibling(struct ta
 	struct sched_group *sg;
 	int i = task_cpu(p);

-	if (idle_cpu(target))
+	if (!sched_feat(PREFER_IDLE) || idle_cpu(target))
 		return target;

 	/*
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -59,6 +59,8 @@ SCHED_FEAT(TTWU_QUEUE, true)
 SCHED_FEAT(FORCE_SD_OVERLAP, false)
 SCHED_FEAT(RT_RUNTIME_SHARE, true)
 SCHED_FEAT(LB_MIN, false)
+SCHED_FEAT(PREFER_IDLE, true)
+SCHED_FEAT(WAKE_WIDE, true)

 /*
  * Apply the automatic NUMA scheduling policy. Enabled automatically
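
To spell out what each knob does when flipped, the two changed conditions can be read as plain boolean predicates. This is only an illustrative sketch: wake_affine_bails() and returns_target_early() are hypothetical helper names, and the bool arguments stand in for sched_feat(), wake_wide(p) and idle_cpu(target).

```c
#include <stdbool.h>

/*
 * wake_affine() gate: the early "return 0" (no affine pull) is taken only
 * when the WAKE_WIDE knob is on AND wake_wide(p) fires. With NO_WAKE_WIDE
 * the wake_wide() result is ignored.
 */
static bool wake_affine_bails(bool feat_wake_wide, bool wake_wide_result)
{
	return feat_wake_wide && wake_wide_result;
}

/*
 * select_idle_sibling() gate: target is returned straight away when the
 * PREFER_IDLE knob is off OR target is already idle. With NO_PREFER_IDLE
 * the search for an idle sibling is skipped entirely.
 */
static bool returns_target_early(bool feat_prefer_idle, bool target_idle)
{
	return !feat_prefer_idle || target_idle;
}
```

With both knobs at their default of true, the predicates reduce to the original wake_wide(p) and idle_cpu(target) tests, so stock behaviour is unchanged until a knob is flipped.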

