Re: [question] sched: idle_avg and migration latency

From: Alex Shi
Date: Tue Dec 10 2013 - 10:20:13 EST


CCing MikeG, who wrote this part. :)
I will try to explain what I know. Apologies if my understanding is incorrect.

On 12/10/2013 07:30 PM, Daniel Lezcano wrote:
>
> Hi All,
>
> I am trying to understand how is computed the idle_avg and how it is
> used regarding the migration latency.
>
> 1. What is the sysctl_sched_migration_cost value ? It is initialized to
> 500000UL. Is it an arbitrarily chosen value ? Could it change depending
> on the hardware performances ?

The current sysctl_sched_migration_cost is 0.5 ms (500000 ns); it is used
to limit overscheduling. I guess it is somewhat arbitrary, but it can be
changed by writing to /proc/sys/kernel/sched_migration_cost_ns.
So if you find a more suitable value for a particular scenario, I guess
PeterZ would like to modify it. :)
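
For anyone who wants to check the value on their own machine, here is a
minimal userspace sketch. The helper name read_ns_tunable() is made up for
this mail, and the path above may not exist on every kernel, so the
function just reports failure in that case.

```c
#include <stdio.h>

/*
 * Read an unsigned nanosecond value from a procfs-style file.
 * Returns 0 on success and stores the value in *ns; returns -1 if the
 * file cannot be opened or does not start with an unsigned integer.
 * (Hypothetical helper, not part of any kernel or libc API.)
 */
static int read_ns_tunable(const char *path, unsigned long long *ns)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%llu", ns) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling it with "/proc/sys/kernel/sched_migration_cost_ns" and dividing the
result by 1000000.0 gives the cost in milliseconds.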

>
>
> 2. The idle_balance function checks:
>
> if (this_rq->avg_idle < sysctl_sched_migration_cost)
>     return 0;
>
> IIUC, it is not worth to migrate a task to this cpu as we expect to run
> another task before we can pull a task to the current cpu, right ?

No, it is used to prevent every idle_balance from causing a task migration
when idle balancing happens too often and too quickly, i.e. at a frequency
higher than the task migration limit.
>
> Then if there is no task to balance we will enter idle, thus we
> initialize the idle_stamp to the current clock.

If we pulled a task, we restart the frequency calculation by setting
idle_stamp = 0;
likewise, if a new task is added to this rq, that allows more idle_balance.
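
To make the gating concrete, here is a toy userspace model, not the kernel
code. struct toy_rq, toy_idle_balance() and the can_pull flag are invented
for illustration, and I am assuming the stamp is taken before the avg_idle
check. It only shows how avg_idle throttles balance attempts and how a
successful pull clears idle_stamp.

```c
#include <stdint.h>

#define TOY_MIGRATION_COST_NS 500000ULL /* default value quoted above */

/* Toy stand-in for the two runqueue fields discussed in this thread. */
struct toy_rq {
    uint64_t avg_idle;   /* running estimate of idle period length, ns */
    uint64_t idle_stamp; /* time the CPU went idle; 0 means "not idle" */
};

/*
 * Returns 1 if a balance attempt was made, 0 if it was skipped.
 * can_pull stands in for the real search for a task to migrate.
 */
static int toy_idle_balance(struct toy_rq *rq, uint64_t now, int can_pull)
{
    rq->idle_stamp = now;       /* assume we are about to go idle */

    if (rq->avg_idle < TOY_MIGRATION_COST_NS)
        return 0;               /* idle periods too short: skip the pull */

    if (can_pull)
        rq->idle_stamp = 0;     /* got a task, so we are not idle after all */

    return 1;
}
```

With avg_idle at 100000 ns the attempt is skipped and the stamp stays set;
with avg_idle at 600000 ns and a task available, the stamp is cleared.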
>
> When another task is woken up with the ttwu_do_wakeup, the duration of
> the idle time is computed in there:
>
> if (rq->idle_stamp) {
>     u64 delta = rq_clock(rq) - rq->idle_stamp;
>     u64 max = 2*sysctl_sched_migration_cost;
>
>     if (delta > max)
>         rq->avg_idle = max;
>     else
>         update_avg(&rq->avg_idle, delta);
>     rq->idle_stamp = 0;
> }
>
> Why is the 'delta' leveraged by 'max' ?
>
>
> 3. And finally the function update_avg does:
>
> s64 diff = sample - *avg;
> *avg += diff >> 3;
>
> Why is diff >> 3 used instead of the number of values ?

It is a kind of decay, but I have no idea why the value is '3'. I guess
MikeG has some reason.
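
For what it is worth, the arithmetic itself can be checked in userspace.
The shift by 3 makes each new sample count for 1/8 and the old average for
7/8, i.e. avg_new = avg + (sample - avg) / 8. The sketch below copies the
quoted logic with stdint types; record_idle() is a name I made up to wrap
the clamp from ttwu_do_wakeup. My reading, which is an assumption, is that
the clamp keeps one very long idle period from dominating the average.

```c
#include <stdint.h>

/*
 * Same arithmetic as the quoted update_avg(): move avg one eighth of the
 * way toward sample. Relies on >> of a negative signed value being an
 * arithmetic shift, which holds on the common compilers.
 */
static void update_avg(uint64_t *avg, uint64_t sample)
{
    int64_t diff = (int64_t)(sample - *avg);

    *avg += diff >> 3;
}

/*
 * Hypothetical wrapper around the quoted clamp: an idle period longer
 * than 2 * migration_cost sets the average to that maximum directly
 * instead of being averaged in.
 */
static void record_idle(uint64_t *avg_idle, uint64_t delta,
                        uint64_t migration_cost)
{
    uint64_t max = 2 * migration_cost;

    if (delta > max)
        *avg_idle = max;
    else
        update_avg(avg_idle, delta);
}
```

Starting from avg = 0, feeding a sample of 800 gives 100, and a second
sample of 800 gives 187, so the average only creeps toward the new value.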
>
> Thanks in advance for any answers
>
> -- Daniel
>


--
Thanks
Alex