Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

From: Steven Sistare
Date: Thu Feb 15 2018 - 13:22:24 EST


On 2/15/2018 1:07 PM, Mike Galbraith wrote:
> On Thu, 2018-02-15 at 11:35 -0500, Steven Sistare wrote:
>> On 2/10/2018 1:37 AM, Mike Galbraith wrote:
>>> On Fri, 2018-02-09 at 11:08 -0500, Steven Sistare wrote:
>>>>>> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>>>>>>  		if (!(sd->flags & SD_LOAD_BALANCE))
>>>>>>  			continue;
>>>>>>
>>>>>> -		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
>>>>>> +		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost +
>>>>>> +		    sd->sched_migration_cost) {
>>>>>>  			update_next_balance(sd, &next_balance);
>>>>>>  			break;
>>>>>>  		}
>>>>>
>>>>> Ditto.
>>>>
>>>> The old code did not migrate if the expected costs exceeded the expected idle
>>>> time. The new code just adds the sd-specific penalty (essentially loss of cache
>>>> footprint) to the costs. The for_each_domain loop visits sd's from smallest
>>>> to largest, hence from smallest to largest migration cost (though the tunables
>>>> do not enforce an ordering), and bails at the first sd where the total cost is a loss.
>>>
>>> Hrm..
>>>
>>> You're now adding a hypothetical cost to the measured cost of running
>>> the LB machinery, which implies that the measurement is insufficient,
>>> but you still don't say why it is insufficient.  What happens if you
>>> don't do that?  I ask, because when I removed the...
>>>
>>>    this_rq->avg_idle < sysctl_sched_migration_cost
>>>
>>> ...bits to check removal effect for Peter, the original reason for it
>>> being added did not re-materialize, making me wonder why you need to
>>> make this cutoff more aggressive.
>>
>> The current code with sysctl_sched_migration_cost discourages migration
>> too much, per our test results.
>
> That's why I asked you what happens if you only whack the _apparently_
> (but maybe not) obsolete old throttle.  It appeared likely that your win
> came from allowing a bit more migration than the simple throttle
> allowed, which, if true, would obviate the need for anything more.
>
>> Can you provide more details on the sysbench oltp test that motivated you
>> to add sysctl_sched_migration_cost to idle_balance, so Rohit can re-test it?
>
> The problem at that time was the cycle overhead of entering that LB
> path at high frequency.  Dirt simple.

I get that. I meant please provide details on test parameters and config if
you remember them.
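
To make the comparison concrete, here is a rough userspace sketch of the
per-domain cutoff from the hunk quoted above, with and without the extra
sd->sched_migration_cost term. It is a model only, not kernel code: struct
model_sd and domains_searched() are hypothetical names, all times are in
nanoseconds, and it assumes that balancing at each level costs exactly its
recorded max_newidle_lb_cost, whereas the kernel accumulates the measured cost.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one sched domain; times are in nanoseconds. */
struct model_sd {
	uint64_t max_newidle_lb_cost;	/* worst measured cost of balancing here */
	uint64_t sched_migration_cost;	/* per-domain penalty proposed in the RFC */
};

/*
 * Walk the domains from smallest to largest, as for_each_domain does, and
 * return how many of them would be searched before the cutoff triggers.
 * When use_migration_cost is true, the RFC's extra term is added.
 */
static size_t domains_searched(uint64_t avg_idle, const struct model_sd *sd,
			       size_t nr, bool use_migration_cost)
{
	uint64_t curr_cost = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t threshold = curr_cost + sd[i].max_newidle_lb_cost;

		if (use_migration_cost)
			threshold += sd[i].sched_migration_cost;

		/* Expected cost exceeds expected idle time: stop here. */
		if (avg_idle < threshold)
			break;

		/* Model assumption: this level costs its recorded maximum. */
		curr_cost += sd[i].max_newidle_lb_cost;
	}
	return i;
}
```

With made-up inputs, the two conditions can be compared by calling
domains_searched() twice on the same domain array and avg_idle, once with
use_migration_cost false and once with it true. The numbers fed in are
illustrative only and say nothing about real workloads.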

- Steve