Re: [PATCH 2/2] sched: Rewrite per entity runnable load average tracking

From: bsegall
Date: Wed Jul 09 2014 - 15:07:21 EST

Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Wed, Jul 09, 2014 at 09:07:53AM +0800, Yuyang Du wrote:
>> That is challenging... Can someone (Peter) grant us a lock of the remote rq? :)
> Nope :-).. we got rid of that lock for a good reason.
> Also, this is one area where I feel performance really trumps
> correctness, we can fudge the blocked load a little. So the
> sched_clock_cpu() difference is a strict upper bound on the
> rq_clock_task() difference (and under 'normal' circumstances shouldn't
> be much off).

Well, unless IRQ_TIME_ACCOUNTING or such is on, in which case you lose.
Or am I misunderstanding the suggestion? Actually the simplest thing
would probably be to grab last_update_time (which on 32-bit could be
done with the _copy hack) and use that. Then I think the accuracy is
only worse than current in that you can lose runnable load as well as
blocked load, and that it isn't as easily corrected - currently if the
blocked tasks wake up they'll add the correct numbers to
runnable_load_avg, even if blocked_load_avg is screwed up and hit zero.
This code would have to wait until it stabilized again.
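
For reference, a rough userspace sketch of what I mean by the _copy hack,
in the style of the existing min_vruntime_copy handling on 32-bit. This is
only an illustration under assumptions: struct avg_stamp, stamp_write() and
stamp_read() are made-up names, not anything in the patch, and the C11
fences stand in for smp_wmb()/smp_rmb(). The writer stores the value and
then the copy; the lockless reader retries until both agree, so it never
returns a torn 64-bit value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the state that carries last_update_time. */
struct avg_stamp {
	_Atomic uint64_t last_update_time;
	_Atomic uint64_t last_update_time_copy;
};

/* Writer: the owning cpu (which would hold its own rq->lock in the kernel). */
static void stamp_write(struct avg_stamp *s, uint64_t now)
{
	atomic_store_explicit(&s->last_update_time, now, memory_order_relaxed);
	/* Order the value before the copy; stands in for smp_wmb(). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->last_update_time_copy, now,
			      memory_order_relaxed);
}

/* Reader: a remote cpu, taking no lock. */
static uint64_t stamp_read(struct avg_stamp *s)
{
	uint64_t copy, val;

	do {
		copy = atomic_load_explicit(&s->last_update_time_copy,
					    memory_order_relaxed);
		/* Order the copy before the value; stands in for smp_rmb(). */
		atomic_thread_fence(memory_order_acquire);
		val = atomic_load_explicit(&s->last_update_time,
					   memory_order_relaxed);
	} while (val != copy);	/* a writer was in flight: retry */

	return val;
}
```

On 64-bit none of this is needed, since a single aligned load of the
timestamp is enough there.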

> So we could simply use a timestamp from dequeue and one from enqueue,
> and use that.
> As to the remote subtraction, a RMW on another cacheline than the
> rq->lock one should be good; esp since we don't actually observe the
> per-rq total often (once per tick or so) I think, no?

Yeah, it's definitely a different cacheline, and the current code only
reads per-ms or on loadbalance migration.
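
To make the remote subtraction concrete, here is a minimal userspace
sketch, assuming a 64-byte cacheline. struct rq_sketch, its fields and both
helpers are hypothetical names for illustration only, not the patch's code.
The remote cpu does a single atomic RMW on a counter that sits on its own
cacheline, and the owning cpu folds the accumulated amount in when it next
looks at the total (per-ms or so), without anyone taking the remote
rq->lock.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>

#define CACHELINE_BYTES 64	/* assumption; arch dependent */

/* Hypothetical, heavily simplified runqueue. */
struct rq_sketch {
	int lock;		/* placeholder for rq->lock */
	long blocked_load;	/* only touched by the owning cpu */
	/* Kept off the lock's cacheline so remote RMWs do not bounce it. */
	alignas(CACHELINE_BYTES) atomic_long removed_load;
};

/* Remote cpu: the task was woken elsewhere, so drop its contribution. */
static void remote_remove_load(struct rq_sketch *rq, long contrib)
{
	atomic_fetch_add_explicit(&rq->removed_load, contrib,
				  memory_order_relaxed);
}

/* Owning cpu: fold in whatever was removed remotely since the last call. */
static long fold_removed_load(struct rq_sketch *rq)
{
	long removed = atomic_exchange_explicit(&rq->removed_load, 0,
						memory_order_relaxed);

	rq->blocked_load -= removed;
	if (rq->blocked_load < 0)	/* fudged numbers can overshoot */
		rq->blocked_load = 0;

	return rq->blocked_load;
}
```

The clamp at zero is where the "fudge the blocked load a little" part
shows up: the remote side may subtract more than the decayed total still
holds.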

> The thing is, we do not want to disturb scheduling on whatever cpu the
> task last ran on if we wake it to another cpu. Taking rq->lock wrecks
> that for sure.