Re: [PATCH 0/2 v4] sched: Rewrite per entity runnable load average tracking

From: Yuyang Du
Date: Wed Jul 30 2014 - 23:19:55 EST


Hi Morten,

On Wed, Jul 30, 2014 at 11:13:31AM +0100, Morten Rasmussen wrote:
> > > 2. runnable_load_avg and blocked_load_avg are combined
> > >
> > > runnable_load_avg currently represents the sum of load_avg_contrib of
> > > all tasks on the rq, while blocked_load_avg is the sum of those tasks
> > > not on a runqueue. It makes perfect sense to consider the sum of both
> > > when calculating the load of a cpu, but we currently don't include
> > > blocked_load_avg. The reason for that is the priority scaling of the
> > > task load_avg_contrib may lead to under-utilization of cpus that
> > > occasionally have a tiny high-priority task running. You can easily have a
> > > task that takes 5% of cpu time but has a load_avg_contrib several times
> > > larger than a default priority task runnable 100% of the time.
> >
> > So this is the effect of historical averaging and weight scaling, both of which
> > are just generally good, but may have bad cases.
>
> I don't agree that weight scaling is generally good. There have been
> several threads discussing that topic over the last half year or so. It
> is there to ensure smp niceness, but it makes load-balancing on systems
> which are not fully utilized sub-optimal. You may end up with some cpus
> not being fully utilized while others are over-utilized when you have
> multiple tasks running at different priorities.
>
> It is a very real problem when user-space uses priorities extensively
> like Android does. Tasks related to audio run at very high priorities
> but only for a very short amount of time, but due to the priority
> scaling their load ends up being several times higher than tasks running
> all the time at normal priority. Hence task load is a very poor
> indicator of utilization.

I understand the problem you are describing, but the problem has not been
described crystal clearly yet.

You are saying that tasks with big weight contribute too much, even though they
run only for a short time. But is that unfair, or does it lead to imbalance? It
is hard to say yes, if the answer is not outright no. They have big weight, so
they are supposed to be "unfair" versus small-weight tasks, for the sake of
fairness. In addition, since they run only for a short time, their runnable
weight/load is already offset by that factor.
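
To put rough numbers on the audio example (a back-of-the-envelope sketch of my
own, approximating load_avg_contrib as the runnable fraction times the weight,
with the prio_to_weight[] values from kernel/sched/sched.h):

/*
 * Back-of-the-envelope only, not kernel code: approximate a task's
 * load_avg_contrib as its runnable fraction times its weight, using
 * the prio_to_weight[] values from kernel/sched/sched.h.
 */
#include <stdio.h>

int main(void)
{
	unsigned long w_nice_0  = 1024;		/* nice   0 */
	unsigned long w_nice_20 = 88761;	/* nice -20 */

	/* nice 0 task, runnable 100% of the time */
	unsigned long contrib_normal = 100 * w_nice_0 / 100;

	/* nice -20 task, runnable only 5% of the time */
	unsigned long contrib_audio = 5 * w_nice_20 / 100;

	printf("nice   0, 100%% runnable: contrib ~%lu\n", contrib_normal);
	printf("nice -20,   5%% runnable: contrib ~%lu\n", contrib_audio);
	return 0;
}

The short runtime does offset the big weight, but the 5% nice -20 task still
contributes roughly four times the load of the always-running nice 0 task.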

I think I am arguing from a pure fairness point of view, which is just generally
good in the sense that we can't think of anything more "generally good" to
replace it with.

And you are saying that when a big-weight task is not runnable but still
contributes "too much" load, it leads to under-utilization. So this is a matter
of our prediction algorithm. I am afraid I will again say the prediction is
generally good. For the audio example, which is strictly periodic, it just
can't be better.
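
For reference, a rough sketch (mine, not the kernel implementation, which uses
the fixed-point runnable_avg_yN_inv[] table rather than pow()) of how a blocked
task's contribution decays, built on the y^32 == 1/2 decay the tracking uses:

/*
 * Illustration only: each ~1ms period multiplies the old contribution
 * by y, where y^32 == 1/2, so a blocked task's load fades away with a
 * half-life of about 32ms.
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
	double y = pow(0.5, 1.0 / 32.0);	/* per-period decay factor */
	double contrib = 4438.0;		/* e.g. the nice -20 task above */

	for (int ms = 0; ms <= 64; ms += 16)
		printf("blocked for %2d ms: contrib ~%.0f\n",
		       ms, contrib * pow(y, ms));
	return 0;
}

For a strictly periodic task this settles into a steady oscillation around its
long-run average, which is about the best a history-based signal can do.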

FWIW, I am really not sure how serious this under-utilization problem is in the
real world.

I am not saying your argument does not make sense. It makes every sense from a
specific-case point of view, and I do think there absolutely can be sub-optimal
cases. But as I said, I just don't think the problem description is clear
enough for us to know whether it is worth solving (by weighing the pros and
cons) and how to solve it, either generally or specifically.

Plus, as Peter said, we have to live with user space using big weights, and
handle them the way weight is supposed to work.

Thanks,
Yuyang