Re: 4.3 group scheduling regression

From: Yuyang Du
Date: Tue Oct 13 2015 - 04:24:35 EST


On Tue, Oct 13, 2015 at 10:06:48AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 13, 2015 at 03:55:17AM +0800, Yuyang Du wrote:
>
> > I think maybe the real disease is the tg->load_avg is not updated in time.
> > I.e., it is after migrate, the source cfs_rq does not decrease its contribution
> > to the parent's tg->load_avg fast enough.
>
> No, using the load_avg for shares calculation seems wrong; that would
> mean we'd first have to ramp up the avg before you react.
>
> You want to react quickly to actual load changes, esp. going up.
>
> We use the avg to guess the global group load, since that's the best
> compromise we have, but locally it doesn't make sense to use the avg if
> we have the actual values.

In Mike's case, the mplayer group has only one active task, so after
the task migrates, the source cfs_rq should contribute zero to the
tg, and at the destination the group entity should get the entire tg's
share. The only question is whether the zeroing can be as fast as we need.
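
To make the arithmetic concrete, here is a rough sketch of the proportional
shares calculation, not the kernel code. The function name, the argument
names, and the MIN_SHARES value of 2 are assumptions made for illustration,
and the numbers assume one task of weight 1024 in a group with
tg->shares = 1024.

```python
MIN_SHARES = 2  # assumed lower bound, for illustration only


def group_entity_shares(tg_shares, tg_load_avg, local_contrib, local_load):
    """Approximate the weight given to a group entity on one CPU.

    tg_load_avg   -- sum of all per-CPU contributions to the task group
    local_contrib -- what this CPU last contributed to tg_load_avg
    local_load    -- the load this CPU wants to account for right now
    """
    # Swap this CPU's last published contribution for its current load.
    tg_load = tg_load_avg - local_contrib + local_load
    if tg_load <= 0:
        return tg_shares
    shares = tg_shares * local_load // tg_load
    # Clamp to [MIN_SHARES, tg_shares].
    return max(MIN_SHARES, min(shares, tg_shares))


# Destination CPU right after the migration: it has not contributed yet.
# Source contribution still present in tg_load_avg (stale by 1024):
stale = group_entity_shares(1024, tg_load_avg=1024, local_contrib=0, local_load=1024)
# Source contribution already zeroed:
zeroed = group_entity_shares(1024, tg_load_avg=0, local_contrib=0, local_load=1024)
print(stale, zeroed)
```

In this model the destination group entity gets only half of the tg's
share for as long as the stale source contribution stays in the sum, and the
whole share once it is gone.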

But yes, in the general case, the load_avg (which includes the blocked load)
is likely to lag behind. Using the actual load.weight to accelerate the
process makes sense. It is especially helpful to the less hungry tasks.
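
The lag can be illustrated with a simplified floating-point model of a
geometrically decayed average whose half-life is 32 periods (roughly 1ms
each). This is only a sketch under that assumption: the kernel uses
fixed-point arithmetic, and ramped_avg() is a hypothetical helper, not a
kernel function.

```python
HALF_LIFE_PERIODS = 32               # a contribution halves after 32 periods
Y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)  # per-period decay factor


def ramped_avg(weight, periods, start=0.0):
    """Average after `periods` periods of constant `weight`.

    `start` is the average at period 0. Pass weight=0 to model blocked
    load that is only decaying.
    """
    decay = Y ** periods
    return start * decay + weight * (1.0 - decay)


weight = 1024
for periods in (0, 8, 32, 64):
    rising = ramped_avg(weight, periods)            # task just became runnable
    falling = ramped_avg(0, periods, start=weight)  # task left, load is blocked
    print(f"{periods:3d} periods: rising {rising:7.1f}  falling {falling:7.1f}  "
          f"load.weight {weight}")
```

In this model the average needs 32 periods to cover half the distance to the
real load in either direction, while load.weight reflects the change
immediately.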