Re: [PATCH 2/2] sched: Rewrite per entity runnable load average tracking

From: Yuyang Du
Date: Thu Jul 10 2014 - 03:33:48 EST

Thanks, Peter.

On Wed, Jul 09, 2014 at 08:45:43PM +0200, Peter Zijlstra wrote:

> Nope :-).. we got rid of that lock for a good reason.
> Also, this is one area where I feel performance really trumps
> correctness, we can fudge the blocked load a little. So the
> sched_clock_cpu() difference is a strict upper bound on the
> rq_clock_task() difference (and under 'normal' circumstances shouldn't
> be much off).

Strictly, migrating a wakee task that is on a remote CPU entails two steps:

(1) Catch up with the task's queue's last_update_time, and then subtract

(2) Catch up with the "current" time of the remote CPU (so that the times are
comparable), and then on the new CPU, change to the new timing source (at enqueue)

So I will try sched_clock_cpu(remote_cpu) for step (2). For step (2), maybe we
should not use cfs_rq_clock_task anyway, since the task is about to go
to another CPU/queue. Is this right?
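
To make the two steps concrete, here is a rough user-space sketch, not the
patch itself. Everything in it is an assumption for illustration: the toy_*
names and structs are hypothetical, the decay is a crude "halve every 32
periods" step rather than the real geometric series, and concurrency (the
lockless remote subtraction) is ignored entirely.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's structures. */
struct toy_entity {
	uint64_t last_update_time;	/* ns, in the clock of its current queue */
	uint64_t load_avg;
};

struct toy_queue {
	uint64_t last_update_time;	/* ns, when the queue's load was last updated */
	uint64_t load_avg;
};

#define TOY_PERIOD_NS	(1024ULL * 1024ULL)		/* roughly 1ms */
#define TOY_HALFLIFE_NS	(32ULL * TOY_PERIOD_NS)

/* Toy decay: halve the load once per full half-life elapsed. */
static uint64_t toy_decay(uint64_t load, uint64_t delta_ns)
{
	uint64_t halvings = delta_ns / TOY_HALFLIFE_NS;

	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	return halvings >= 64 ? 0 : load >> halvings;
}

/* Decay the entity's load up to 'now' and record the new timestamp. */
static void toy_catch_up(struct toy_entity *se, uint64_t now)
{
	if (now <= se->last_update_time)
		return;
	se->load_avg = toy_decay(se->load_avg, now - se->last_update_time);
	se->last_update_time = now;
}

static void toy_migrate(struct toy_entity *se, struct toy_queue *old_q,
			uint64_t old_cpu_now, uint64_t new_cpu_now)
{
	/* Step (1): catch up with the old queue's last_update_time ... */
	toy_catch_up(se, old_q->last_update_time);

	/* ... and then subtract the entity's load from that queue (clamped). */
	if (se->load_avg < old_q->load_avg)
		old_q->load_avg -= se->load_avg;
	else
		old_q->load_avg = 0;

	/* Step (2): catch up with the "current" time of the old CPU ... */
	toy_catch_up(se, old_cpu_now);

	/* ... and switch to the new CPU's timing source at enqueue. */
	se->last_update_time = new_cpu_now;
}
```

In this sketch old_cpu_now stands in for whatever clock is read for step (2),
e.g. sched_clock_cpu(remote_cpu), and new_cpu_now for the new queue's clock
at enqueue time.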

I made another mistake. We should not only track task entity load; group entity
load (as an entity) is also needed. Otherwise, task_h_load can't be done correctly...
Sorry for the mess-up. But this won't change the code much.


> So we could simply use a timestamps from dequeue and one from enqueue,
> and use that.
> As to the remote subtraction, a RMW on another cacheline than the
> rq->lock one should be good; esp since we don't actually observe the
> per-rq total often (once per tick or so) I think, no?
> The thing is, we do not want to disturb scheduling on whatever cpu the
> task last ran on if we wake it to another cpu. Taking rq->lock wrecks
> that for sure.
