Re: [patch 3/3] sched: move sched_avg_update() to update_cpu_load()

From: Peter Zijlstra
Date: Mon Aug 16 2010 - 15:31:47 EST


On Mon, 2010-08-16 at 10:46 -0700, Suresh Siddha wrote:

> There is no guarantee that the original cpu won't be doing this in
> parallel with nohz idle load balancing cpu.

Hmm, true.. bugger.

> > > Fix it by moving the sched_avg_update() to more appropriate update_cpu_load()
> > > where the CFS load gets updated as well.
> >
> > Right, except it breaks things a bit, at the very least you really need
> > that update right before reading it, otherwise you can end up with >100%
> > fractions, which are odd indeed ;-)
>
> With the patch, the update always happens before reading it, doesn't it?
>
> The update now happens during the scheduler tick (or during nohz load
> balancing tick). And the load balancer gets triggered with the tick.
> So the update (at the tick) should happen before reading it (used by
> load balancing triggered by the tick). Am I missing something?

We run the load-balancer in softirq context; on -rt that's a task, and
we could have run other (more important) RT tasks between the hardirq
and the softirq running, which would increase the rt_avg and could thus
result in >100%.

But I think we can simply retain the sched_avg_update(rq) in
sched_rt_avg_update(), which is run with rq->lock held and should be
enough to avoid that case.

We can retain the other bit of your patch, moving sched_avg_update() from
scale_rt_power() to update_cpu_load(), since that is only concerned with
lowering the average when there is no actual activity.
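
To make the two call sites concrete, here is a minimal userspace sketch of
the arrangement described above. It is a simplified model under stated
assumptions, not the kernel source: struct rq_model, the model_* helpers
and MODEL_PERIOD_NS are hypothetical names, and the period value is
illustrative only. The RT accounting path adds runtime and then ages the
average, while the tick path only ages it, so it can only lower the value.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PERIOD_NS 500000000ULL	/* assumed averaging period, illustrative only */

/* Hypothetical stand-in for the few runqueue fields discussed here. */
struct rq_model {
	uint64_t clock;		/* current runqueue time, ns */
	uint64_t age_stamp;	/* start of the current averaging window, ns */
	uint64_t rt_avg;	/* decayed sum of RT runtime, ns */
};

/* Age the average: halve it once for every full period that has elapsed. */
static void model_avg_update(struct rq_model *rq)
{
	assert(rq->clock >= rq->age_stamp);
	while (rq->clock - rq->age_stamp > MODEL_PERIOD_NS) {
		rq->age_stamp += MODEL_PERIOD_NS;
		rq->rt_avg /= 2;
	}
}

/* RT accounting path: add the runtime, then age (the call being retained). */
static void model_rt_avg_update(struct rq_model *rq, uint64_t rt_delta)
{
	rq->rt_avg += rt_delta;
	model_avg_update(rq);
}

/* Tick path: no RT runtime to add, so this can only lower the average. */
static void model_tick(struct rq_model *rq, uint64_t now)
{
	rq->clock = now;
	model_avg_update(rq);
}
```

In this model, accounting 100 ms of RT runtime at clock = 100 ms and then
taking a tick at clock = 1100 ms crosses two full periods, so rt_avg is
halved twice and ends up at 25 ms with no RT activity in between.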

