Re: Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")

From: Vincent Guittot
Date: Thu Apr 26 2018 - 06:31:44 EST


Hi Niklas,

On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
> Hi Vincent,
>
> Here are the results, sorry for the delay.
>
> On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
>
> [snip]
>
> >
> > Thanks for the report. Can you rerun with the following trace-cmd sequence? My previous sequence disabled ftrace events.
> >
> > trace-cmd reset > /dev/null
> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> >
> > I have updated the patch and added traces to check that the scheduler returns from the idle_balance function and doesn't get stuck.
>
> Once more I applied the change below on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> This time the result of 'trace-cmd report' is so large that I do not
> include it here, but I attach the trace.dat file. Not sure why, but the
> timing between sending the NMI and the backtrace print is different (the
> content is the same AFAIK), so on the off chance it can help figure this
> out:
>

Thanks for the trace, I have been able to catch a problem with it.
Could you test the patch below to confirm that the problem is solved?
The patch applies on top of
c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")

From: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Date: Thu, 26 Apr 2018 12:19:32 +0200
Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
we release the rq->lock when updating the blocked load of idle CPUs. This
opens a time window during which another CPU can add a task to this CPU's
cfs_rq. The check for a newly added task in idle_balance() is not in the
common path, so a path that jumps to the out label skips it. Move the out
label above the check so that every path runs it.

Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
Reported-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
Reported-by: Niklas Söderlund <niklas.soderlund@xxxxxxxxxxxx>
Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..15a9f5e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (curr_cost > this_rq->max_idle_balance_cost)
 		this_rq->max_idle_balance_cost = curr_cost;
 
+out:
 	/*
 	 * While browsing the domains, we released the rq lock, a task could
 	 * have been enqueued in the meantime. Since we're not going idle,
@@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (this_rq->cfs.h_nr_running && !pulled_task)
 		pulled_task = 1;
 
-out:
 	/* Move the next balance forward */
 	if (time_after(this_rq->next_balance, next_balance))
 		this_rq->next_balance = next_balance;
--
2.7.4
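
For anyone who wants to see the effect of moving the label without booting a
kernel, here is a minimal user-space sketch of the control flow. It is only a
model and not the kernel code: struct toy_rq, toy_update_blocked_load() and
the two toy_idle_balance_*() functions are hypothetical names, the "remote
enqueue" is simulated by a flag instead of a second CPU, and the early-exit
condition is assumed to be a path that updates the blocked load and then jumps
straight to out.

```c
#include <stdbool.h>

/* Toy stand-in for struct rq: only the counter that the check reads. */
struct toy_rq {
	unsigned int h_nr_running; /* models this_rq->cfs.h_nr_running */
};

/*
 * Hypothetical helper modelling the blocked load update. In the kernel the
 * rq lock is released here; the flag simulates another CPU enqueuing a task
 * on this rq inside that window.
 */
static void toy_update_blocked_load(struct toy_rq *rq, bool remote_enqueue)
{
	/* rq lock would be released here */
	if (remote_enqueue)
		rq->h_nr_running++;
	/* rq lock would be re-acquired here */
}

/* Label placement before the patch: "out" sits after the check. */
static int toy_idle_balance_before(struct toy_rq *rq, bool early_exit,
				   bool remote_enqueue)
{
	int pulled_task = 0;

	if (early_exit) {
		toy_update_blocked_load(rq, remote_enqueue);
		goto out; /* jumps over the check below */
	}

	/* Domain walk elided: this model never pulls a task itself. */

	if (rq->h_nr_running && !pulled_task)
		pulled_task = 1;
out:
	return pulled_task;
}

/* Label placement after the patch: every path runs the check. */
static int toy_idle_balance_after(struct toy_rq *rq, bool early_exit,
				  bool remote_enqueue)
{
	int pulled_task = 0;

	if (early_exit) {
		toy_update_blocked_load(rq, remote_enqueue);
		goto out;
	}

	/* Domain walk elided: this model never pulls a task itself. */

out:
	if (rq->h_nr_running && !pulled_task)
		pulled_task = 1;
	return pulled_task;
}
```

Calling toy_idle_balance_before() with early_exit and remote_enqueue both set
leaves h_nr_running at 1 but still reports that no task was pulled, which is
the situation the patch is meant to close. toy_idle_balance_after() reports a
pulled task for the same input.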



[snip]