Re: [RFC][PATCH] sched: Avoid select_idle_sibling() for wake_affine(.sync=true)

From: Michael Wang
Date: Thu Sep 26 2013 - 02:32:22 EST


On 09/26/2013 01:34 PM, Mike Galbraith wrote:
> On Thu, 2013-09-26 at 13:12 +0800, Michael wang wrote:
>> On 09/26/2013 11:41 AM, Mike Galbraith wrote:
>> [snip]
>>>> Like the case when we have:
>>>>
>>>>   core0 sg        core1 sg
>>>> cpu0    cpu1    cpu2    cpu3
>>>> waker   busy    idle    idle
>>>>
>>>> If the sync wakeup was on cpu0, we can:
>>>>
>>>> 1. choose a cpu in core1 sg like we usually do
>>>> some overhead, but tends to balance the load a little
>>>>   core0 sg        core1 sg
>>>> cpu0    cpu1    cpu2    cpu3
>>>> idle    busy    wakee   idle
>>>
>>> Reducing latency and increasing throughput when the waker isn't really
>>> going to immediately schedule off as the hint implies. Nice for
>>> bursty loads and ramp.
>>>
>>> The breakeven point is going up though. If you don't have nohz
>>> throttled, you eat tick start/stop overhead, and the menu governor
>>> recently added yet more overhead, so maybe we should say hell with it.
>>
>> Exactly, more and more factors to be considered, we say things get
>> balanced but actually it's not the best choice...
>>
>>>
>>>> 2. choose cpu0 like the patch proposed
>>>> no overhead, but tends to make the load a little more unbalanced
>>>>   core0 sg        core1 sg
>>>> cpu0    cpu1    cpu2    cpu3
>>>> wakee   busy    idle    idle
>>>>
>>>> Maybe we should add a higher scope load balance check in wake_affine(),
>>>> but that means higher overhead, which is just what the patch wants to
>>>> reduce...
>>>
>>> Yeah, more overhead is the last thing we need.
>>>
>>>> What about some discount for sync case inside select_idle_sibling()?
>>>> For example we consider sync cpu as idle and prefer it more than the others?
>>>
>>> That's what the sync hint does. Problem is, it's a hint. If it were
>>> truth, there would be no point in calling select_idle_sibling().
>>
>> Just wondering: if the hint is wrong most of the time, then why don't
>> we remove it...
>
> For very fast/light network ping-pong micro-benchmarks, it is right.
> For pipe-test, it's absolutely right, jabbering parties are 100%
> synchronous, there is nada/nil/zip/diddly squat overlap reclaimable..
> but in the real world, it ain't necessarily so.
>
>> Otherwise I think we can still use it to make decisions that tend to
>> be correct, can't we?
>
> Sometimes :)

Ok, a double-edged sword I see :)

Maybe we can wield it carefully here and give the discount to a bigger
scope rather than to the sync cpu itself, for example:

          sg1                           sg2
cpu0   cpu1   cpu2   cpu3     cpu4   cpu5   cpu6   cpu7
waker  idle   idle   idle     idle   idle   idle   idle

If it's a sync wakeup on cpu0 (running only the waker), and the sg is wide
enough that one cpu is not so influential, then assuming cpu0 to be idle
is safer, and preferring sg1 over sg2 is more likely to be right.

And we can still choose an idle cpu at the final step, like cpu1 in this
case, to avoid the risk that the waker doesn't get off the cpu as it said.

The key point is to reduce the influence of sync: trust it a little, but
not totally ;-)
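
To make the idea concrete, here is a rough sketch as a toy user-space
Python model, not kernel code. The names pick_cpu() and min_width, the
"count idle cpus per sg" metric and the tie-break toward the waker's sg
are all made up for illustration; nothing like this exists in
select_idle_sibling() today.

```python
def pick_cpu(groups, busy, waker_cpu, sync, min_width=4):
    """Toy model: pick a wakeup cpu, trusting the sync hint only at sg scope.

    groups    -- list of sched groups, each a list of cpu ids
    busy      -- set of cpu ids that are currently running something
    waker_cpu -- cpu the waker is running on
    sync      -- True if this is a sync wakeup
    min_width -- hypothetical threshold for "the sg is wide enough"
    """
    def idle_count(group):
        # Discount: when ranking groups, count the waker's cpu as idle,
        # but only for a sync wakeup and only if the group is wide enough.
        discount = sync and waker_cpu in group and len(group) >= min_width
        return sum(1 for cpu in group
                   if cpu not in busy or (discount and cpu == waker_cpu))

    # Prefer the group with the most idle cpus; on a tie, prefer the
    # waker's own group.
    best = max(groups, key=lambda g: (idle_count(g), waker_cpu in g))

    # Final step: still take a cpu that is really idle, in case the waker
    # does not get off the cpu as the hint said.
    for cpu in best:
        if cpu not in busy:
            return cpu
    return waker_cpu


sg = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(pick_cpu(sg, busy={0}, waker_cpu=0, sync=True))   # stays in sg1
print(pick_cpu(sg, busy={0}, waker_cpu=0, sync=False))  # moves to sg2
```

With the discount, sg1 ties with sg2 and wins the tie-break, so the wakee
lands on cpu1; without it, sg2 looks idler and the wakee goes to cpu4.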

Regards,
Michael Wang

>
> -Mike
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
