Re: SD_SHARE_CPUPOWER breaks scheduler fairness

From: Steve Rotolo
Date: Thu Jun 02 2005 - 08:30:28 EST


> > Wild thought: how about doing this for the sibling ...
> >
> > rp->nr_running += SOME_BIG_NUMBER
> >
> > when a SCHED_FIFO task starts running on some cpu, and
> > undo the above when the cpu is released. This fools
> > the load balancer into _gradually_ moving tasks off the
> > sibling, when the cpu is hogged by some SCHED_FIFO task,
> > but should have little effect if a SCHED_FIFO task takes
> > little cpu time.
>
> A good thought, and one I had considered. SOME_BIG_NUMBER needs to be
> meaninful for this to work. Ideally what we do is add the effective load from
> the sibling cpu to the pegged cpu. However that's not as useful as it sounds
> because we need to ensure both sibling runqueues are locked every time we
> check the load value of one runqueue, and the last thing I want is to
> introduce yet more locking. Also the value will vary wildly depending on
> whether the task is pegged or not, and this changes in mainline many times in
> less than .1s which means it would throw load balancing way off as the value
> will effectively become meaningless.
>

Just a few more thoughts on this....

I can't help but wonder if a similar problem exists even without HT.
What if the load-balancer decides to keep a sched_normal task on a cpu
that is being dominated by a sched_fifo task. The sched_normal task
should really be "balanced" to a different cpu but because nr_running is
the only balancing criteria that may not happen. Runqueue business
ought to be weighted by the amount of time that sched_fifo tasks on that
runqueue have recently used. So, load = rq->nr_running +
rq->recent_fifo_run_time. I think this would make load-balancing more
correct.

Now back to HT sched_domains... It seems to me that when
SD_SHARE_CPUPOWER is on, recent_fifo_run_time should apply to the whole
domain instead of a single runqueue, so that a cpu's load =
rq->nr_running + sd->recent_fifo_run_time. But I don't know if this
suffers from the same runqueue locking problem that you pointed out.

--
Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/