Re: PELT initial task load and wake_up_new_task()

From: Peter Zijlstra
Date: Thu Dec 17 2015 - 04:43:24 EST


On Thu, Dec 17, 2015 at 07:34:27AM +0800, Yuyang Du wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e3266eb..3f6a8b3 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2908,10 +2908,18 @@ void remove_entity_load_avg(struct sched_entity *se)
> {
> struct cfs_rq *cfs_rq = cfs_rq_of(se);
> u64 last_update_time;
> -
> #ifndef CONFIG_64BIT
> u64 last_update_time_copy;
> +#endif
>
> + /*
> + * Newly created task or never used group entity should not be removed
> + * from its (source) cfs_rq
> + */
> + if (se->avg.last_update_time == 0)
> + return;
> +
> +#ifndef CONFIG_64BIT
> do {
> last_update_time_copy = cfs_rq->load_last_update_time_copy;
> smp_rmb();

So that ifdef stuff annoyed me a wee bit, so I did the below.

Initially I wanted to do a macro that we could use for both this and the
vruntime, but that didn't end up particularly pretty either, so I
scrapped that.

---
Subject: sched: Fix new task's load avg removed from source CPU in wake_up_new_task()
From: Yuyang Du <yuyang.du@xxxxxxxxx>
Date: Thu, 17 Dec 2015 07:34:27 +0800

If a newly created task is selected to go to a different CPU in fork
balance when it wakes up the first time, its load averages should
not be removed from the source CPU since they are never added to
it before. The same is also applicable to a never used group entity.

Fix it in remove_entity_load_avg(): when entity's last_update_time
is 0, simply return. This should precisely identify the case in
question, because in other migrations, the last_update_time is set
to 0 after remove_entity_load_avg().

Reported-by: Steve Muckle <steve.muckle@xxxxxxxxxx>
Signed-off-by: Yuyang Du <yuyang.du@xxxxxxxxx>
Cc: Patrick Bellasi <patrick.bellasi@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Juri Lelli <Juri.Lelli@xxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Morten Rasmussen <morten.rasmussen@xxxxxxx>
[peterz: cfs_rq_last_update_time]
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20151216233427.GJ28098@xxxxxxxxx
---
kernel/sched/fair.c | 38 ++++++++++++++++++++++++++++----------
1 file changed, 28 insertions(+), 10 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2809,27 +2809,45 @@ dequeue_entity_load_avg(struct cfs_rq *c
max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
}

-/*
- * Task first catches up with cfs_rq, and then subtract
- * itself from the cfs_rq (task must be off the queue now).
- */
-void remove_entity_load_avg(struct sched_entity *se)
-{
- struct cfs_rq *cfs_rq = cfs_rq_of(se);
- u64 last_update_time;
-
#ifndef CONFIG_64BIT
+static inline u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
+{
u64 last_update_time_copy;
+ u64 last_update_time;

do {
last_update_time_copy = cfs_rq->load_last_update_time_copy;
smp_rmb();
last_update_time = cfs_rq->avg.last_update_time;
} while (last_update_time != last_update_time_copy);
+
+ return last_update_time;
+}
#else
- last_update_time = cfs_rq->avg.last_update_time;
+static inline u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
+{
+ return cfs_rq->avg.last_update_time;
+}
#endif

+/*
+ * Task first catches up with cfs_rq, and then subtract
+ * itself from the cfs_rq (task must be off the queue now).
+ */
+void remove_entity_load_avg(struct sched_entity *se)
+{
+ struct cfs_rq *cfs_rq = cfs_rq_of(se);
+ u64 last_update_time;
+
+ /*
+ * Newly created task or never used group entity should not be removed
+ * from its (source) cfs_rq
+ */
+ if (se->avg.last_update_time == 0)
+ return;
+
+ last_update_time = cfs_rq_last_update_time(cfs_rq);
+
__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0, NULL);
atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/