Re: 4.3 group scheduling regression

From: Yuyang Du
Date: Mon Oct 12 2015 - 23:44:20 EST


On Mon, Oct 12, 2015 at 12:23:31PM +0200, Mike Galbraith wrote:
> On Mon, 2015-10-12 at 10:12 +0800, Yuyang Du wrote:
>
> > I am guessing it is in calc_tg_weight(), and the naughty boys do make
> > themselves more favored. What a reality...
> >
> > Mike, could you please test the following?
>
> Wow, that was quick. Dinky patch made it all better.
>
> ------------------------------------------------------------------------------------------------------------------
>  Task                 |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at       |
> ------------------------------------------------------------------------------------------------------------------
>  oink:(8)             | 739056.970 ms |    27270 | avg:    2.043 ms | max:   29.105 ms | max at:    339.988310 s |
>  mplayer:(25)         |  36448.997 ms |    44670 | avg:    1.886 ms | max:   72.808 ms | max at:    302.153121 s |
>  Xorg:988             |  13334.908 ms |    22210 | avg:    0.081 ms | max:   25.005 ms | max at:    269.068666 s |
>  testo:(9)            |   2558.540 ms |    13703 | avg:    0.124 ms | max:    6.412 ms | max at:    279.235272 s |
>  konsole:1781         |   1084.316 ms |     1457 | avg:    0.006 ms | max:    1.039 ms | max at:    268.863379 s |
>  kwin:1734            |    879.645 ms |    17855 | avg:    0.458 ms | max:   15.788 ms | max at:    268.854992 s |
>  pulseaudio:1808      |    356.334 ms |    15023 | avg:    0.028 ms | max:    6.134 ms | max at:    324.479766 s |
>  threaded-ml:3483     |    292.782 ms |    25769 | avg:    0.364 ms | max:   40.387 ms | max at:    294.550515 s |
>  plasma-desktop:1745  |    265.055 ms |     1470 | avg:    0.102 ms | max:   21.886 ms | max at:    267.724902 s |
>  perf:3439            |     61.677 ms |        2 | avg:    0.117 ms | max:    0.232 ms | max at:    367.043889 s |

Phew...

I think the real disease may be that tg->load_avg is not updated in time:
after a migration, the source cfs_rq does not decrease its contribution
to the parent's tg->load_avg fast enough.
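
For context, here is roughly how the stale value bites. The sketch below is
paraphrased from memory of the 4.3-era kernel/sched/fair.c (treat it as an
illustration, not the verbatim source): calc_tg_weight() uses tg->load_avg as
the denominator of the per-cpu share, and it can only substitute the current
load for *this* cpu's stale contribution. A contribution that a remote
(source) cfs_rq should already have removed therefore inflates the
denominator, and every share computed in the group comes out too small:

static long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
{
	long tg_weight;

	/*
	 * Swap this cpu's possibly-stale contribution for its current
	 * load; contributions from other cpus may still be stale.
	 */
	tg_weight = atomic_long_read(&tg->load_avg);
	tg_weight -= cfs_rq->tg_load_avg_contrib;
	tg_weight += cfs_rq_load_avg(cfs_rq);

	return tg_weight;
}

static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
{
	long tg_weight, load, shares;

	tg_weight = calc_tg_weight(tg, cfs_rq);
	load = cfs_rq_load_avg(cfs_rq);

	/* share = tg->shares * (this cpu's load / group-wide load) */
	shares = (tg->shares * load);
	if (tg_weight)
		shares /= tg_weight;

	if (shares < MIN_SHARES)
		shares = MIN_SHARES;
	if (shares > tg->shares)
		shares = tg->shares;

	return shares;
}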

--

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4df37a4..3dba883 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2686,12 +2686,13 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
 static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 {
 	struct sched_avg *sa = &cfs_rq->avg;
-	int decayed;
+	int decayed, updated = 0;
 
 	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
 		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
 		sa->load_avg = max_t(long, sa->load_avg - r, 0);
 		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
+		updated = 1;
 	}
 
 	if (atomic_long_read(&cfs_rq->removed_util_avg)) {
@@ -2708,7 +2709,7 @@ static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 	cfs_rq->load_last_update_time_copy = sa->last_update_time;
 #endif
 
-	return decayed;
+	return decayed | updated;
 }
 
 /* Update task and its cfs_rq load average */
--
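
For completeness, here is why the return value matters. Again a paraphrase of
the 4.3-era caller, not the verbatim source: update_load_avg() only propagates
into tg->load_avg when update_cfs_rq_load_avg() reports a change, so before
this patch the subtraction of removed load could go unreported until the next
natural decay:

/* Paraphrased from 4.3-era kernel/sched/fair.c; a sketch, not verbatim. */
static inline void update_load_avg(struct sched_entity *se, int update_tg)
{
	struct cfs_rq *cfs_rq = cfs_rq_of(se);
	u64 now = cfs_rq_clock_task(cfs_rq);
	int cpu = cpu_of(rq_of(cfs_rq));

	/* Decay/accumulate this entity's own average. */
	__update_load_avg(now, cpu, &se->avg,
			  se->on_rq * scale_load_down(se->load.weight),
			  cfs_rq->curr == se, NULL);

	/*
	 * tg->load_avg is only corrected when update_cfs_rq_load_avg()
	 * returns nonzero; with the patch, folding in removed load now
	 * counts as a change, so the group sum drops right after a
	 * migration instead of waiting for the next decay.
	 */
	if (update_cfs_rq_load_avg(now, cfs_rq) && update_tg)
		update_tg_load_avg(cfs_rq, 0);
}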