Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression

From: Rik van Riel
Date: Thu Sep 14 2017 - 11:56:52 EST

On Sun, 2017-09-10 at 23:32 -0700, Joel Fernandes wrote:
> To make the load check more meaningful, I am thinking if using
> wake_affine()'s balance check is a better thing to do than the
> 'nr_running < 2' check I used in this patch. Then again, since commit
> 3fed382b46baac ("sched/numa: Implement NUMA node level wake_affine()",
> wake_affine() doesn't do balance check for CPUs within a socket so
> probably bringing back something like the *old* wake_affine that
> checked load between different CPUs within a socket is needed to avoid
> a potentially disastrous sync decision?

This is because regardless of whether or not we did
an affine wakeup, the code called select_idle_sibling
within that socket, anyway.

In other words, the behavior for within-socket
wakeups was not substantially different with or
without an affine wakeup.

All that changed is which CPU select_idle_sibling
starts searching at, and that only if the woken
task's previous CPU is not idle.
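
A toy user-space model of the point above, not the kernel code: the
names toy_socket, toy_select_idle_sibling() and toy_wake_target() are
made up for illustration, and the linear wrap-around scan is an
assumption standing in for the real search. It only shows that the
affine decision changes the starting CPU of the search, and only when
the previous CPU is busy.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8	/* hypothetical: one socket with 8 CPUs */

/* Toy socket state: idle[cpu] is true when that CPU has nothing to run. */
struct toy_socket {
	bool idle[TOY_NR_CPUS];
};

/*
 * Hypothetical stand-in for select_idle_sibling(): scan the socket
 * starting at 'start', wrapping around, and return the first idle CPU.
 * If nothing is idle, fall back to 'start'.
 */
static int toy_select_idle_sibling(const struct toy_socket *s, int start)
{
	assert(start >= 0 && start < TOY_NR_CPUS);

	for (int i = 0; i < TOY_NR_CPUS; i++) {
		int cpu = (start + i) % TOY_NR_CPUS;

		if (s->idle[cpu])
			return cpu;
	}
	return start;
}

/*
 * Pick a CPU for the woken task. An idle previous CPU wins outright;
 * otherwise 'affine' only chooses where the search begins.
 */
static int toy_wake_target(const struct toy_socket *s, int prev_cpu,
			   int waker_cpu, bool affine)
{
	assert(prev_cpu >= 0 && prev_cpu < TOY_NR_CPUS);
	assert(waker_cpu >= 0 && waker_cpu < TOY_NR_CPUS);

	if (s->idle[prev_cpu])
		return prev_cpu;

	return toy_select_idle_sibling(s, affine ? waker_cpu : prev_cpu);
}
```

With a single idle CPU in the socket, both the affine and non-affine
paths land on it. With several idle CPUs they may pick different ones,
but both still pick an idle CPU inside the same socket.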

> The commit I refer to was
> added with the reason that select_idle_sibling was selecting cores
> anywhere within a socket, but with my patch we're more specifically
> selecting the waker's CPU on passing the sync flag. Could you share
> your thoughts about this?

On systems with SMT, it may make more sense for
sync wakeups to look for idle threads of the same
core than to have the woken task end up on the
same thread, and wait for the current task to stop
running.
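
A rough user-space sketch of the SMT idea, under stated assumptions:
toy_sync_smt_target() is a hypothetical helper, not an existing kernel
function, and it assumes the hardware threads of a core are numbered
contiguously (CPUs 0-1 form core 0, CPUs 2-3 form core 1, and so on),
which real topologies do not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SMT_CPUS		8	/* hypothetical CPU count */
#define TOY_THREADS_PER_CORE	2	/* hypothetical SMT width */

/*
 * For a sync wakeup, prefer an idle hardware thread on the waker's
 * core instead of queueing the woken task behind the waker itself.
 * Returns the idle sibling, or -1 if the waker's core has none, in
 * which case the caller would fall back to its usual search.
 */
static int toy_sync_smt_target(const bool idle[TOY_SMT_CPUS], int waker_cpu)
{
	assert(waker_cpu >= 0 && waker_cpu < TOY_SMT_CPUS);

	/* First hardware thread of the core that contains waker_cpu. */
	int first = waker_cpu - (waker_cpu % TOY_THREADS_PER_CORE);

	for (int cpu = first; cpu < first + TOY_THREADS_PER_CORE; cpu++) {
		if (cpu != waker_cpu && idle[cpu])
			return cpu;
	}
	return -1;
}
```

The waker's own thread is skipped on purpose: the waker is still
running there, so placing the woken task on it means waiting.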

"Strong sync" wakeups like you propose would also
change the semantics of wake_wide() and potentially
other bits of code...

All rights reversed
