Re: [patch 10/15] sched/migration: Move calc_load_migrate() into CPU_DYING

From: Peter Zijlstra
Date: Wed Jul 13 2016 - 03:50:37 EST


On Tue, Jul 12, 2016 at 06:33:56PM +0200, Thomas Gleixner wrote:

> Subject: sched/migration: Correct off by one in load migration
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> The move of calc_load_migrate() from CPU_DEAD to CPU_DYING did not take into
> account that the function is now called from a thread running on the outgoing
> CPU. As a result a CPU unplug leaks a load of 1 into the global load
> accounting mechanism.
>
> Fix it by adjusting for the currently running thread which calls
> calc_load_migrate().
>
> Fixes: e9cd8fa4fcfd: "sched/migration: Move calc_load_migrate() into CPU_DYING"
> Reported-by: Anton Blanchard <anton@xxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

> +++ b/kernel/sched/loadavg.c
> @@ -78,11 +78,11 @@ void get_avenrun(unsigned long *loads, unsigned long offset, int shift)
>  	loads[2] = (avenrun[2] + offset) << shift;
>  }
>
> -long calc_load_fold_active(struct rq *this_rq)
> +long calc_load_fold_active(struct rq *this_rq, long adjust)
>  {
>  	long nr_active, delta = 0;
>
> -	nr_active = this_rq->nr_running;
> +	nr_active = this_rq->nr_running - adjust;
>  	nr_active += (long)this_rq->nr_uninterruptible;
>
>  	if (nr_active != this_rq->calc_load_active) {

Yeah, I think this is the only sensible approach.
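
To spell out the arithmetic behind the off by one, here is a minimal
userspace sketch, not the kernel code. struct rq_sketch,
fold_active_sketch() and remaining_global_load_sketch() are hypothetical
names made up for illustration; the tail of the fold function (computing
delta and updating calc_load_active) is not part of the quoted hunk and
is assumed here. The scenario is also an assumption: the outgoing CPU
last folded 3 active tasks, they have all been migrated away, and only
the thread calling calc_load_migrate() is still counted in nr_running.

```c
/* Hypothetical stand-in for struct rq: only the fields the hunk touches. */
struct rq_sketch {
	unsigned int nr_running;	/* runnable tasks, including the caller */
	long nr_uninterruptible;	/* tasks in uninterruptible sleep */
	long calc_load_active;		/* value last folded into the global count */
};

/*
 * Mirrors the patched calc_load_fold_active(): 'adjust' removes tasks that
 * are counted in nr_running but must not be accounted as load.
 * The delta/update part below is assumed, it is not in the quoted hunk.
 */
static long fold_active_sketch(struct rq_sketch *rq, long adjust)
{
	long nr_active, delta = 0;

	nr_active = (long)rq->nr_running - adjust;
	nr_active += rq->nr_uninterruptible;

	if (nr_active != rq->calc_load_active) {
		delta = nr_active - rq->calc_load_active;
		rq->calc_load_active = nr_active;
	}

	return delta;
}

/*
 * Assumed scenario: the dying CPU previously contributed 3 to the global
 * count and now only runs the thread doing the migration.
 * Returns what is left in the global count after folding.
 */
static long remaining_global_load_sketch(long adjust)
{
	struct rq_sketch rq = {
		.nr_running = 1,
		.nr_uninterruptible = 0,
		.calc_load_active = 3,
	};
	long global_tasks = 3;

	global_tasks += fold_active_sketch(&rq, adjust);
	return global_tasks;
}
```

With adjust == 0 (the old behaviour when called from CPU_DYING) the
sketch leaves 1 behind in the global count; with adjust == 1 it drops
to 0, which is what the patch is after.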

How do you want to route this?