Re: sched: hang in migrate_swap

From: Kirill Tkhai
Date: Thu Apr 10 2014 - 09:46:30 EST


10.04.2014, 11:00, "Michael wang" <wangyun@xxxxxxxxxxxxxxxxxx>:
> On 04/10/2014 11:31 AM, Sasha Levin wrote:
> [snip]
>
>>  I'd like to re-open this issue. It seems that something broke and I'm
>>  now seeing the same issues that have gone away 2 months with this patch
>>  again.
>
> A new mechanism has been designed to move the priority checking inside
> idle_balance(), including Kirill who is the designer ;-)

I'm not sure it's connected with my patch, but it looks like we forgot
about exactly the stop class. Maybe this will help?
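
To make the idea easier to follow, here is a minimal user-space sketch
of the retry pattern the patch below relies on. It is a toy model, not
kernel code: every toy_* name, the TOY_RETRY/TOY_NONE values and the
stop_arrives_during_pull flag are assumptions made up for illustration.
TOY_RETRY stands in for RETRY_TASK, and toy_pull() stands in for a pull
that drops and re-acquires rq->lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy scheduling classes, highest priority first (illustrative only). */
enum toy_class { TOY_STOP, TOY_DL, TOY_RT, TOY_FAIR, TOY_NR_CLASSES };

#define TOY_RETRY (-1)	/* stands in for RETRY_TASK */
#define TOY_NONE  (-2)	/* this class has nothing runnable */

struct toy_rq {
	bool stop_on_rq;		/* models rq->stop && rq->stop->on_rq */
	int dl_nr_running;
	int rt_nr_running;
	int fair_nr_running;
	bool stop_arrives_during_pull;	/* hypothetical: wakeup while lock is dropped */
	int retries;			/* how many times selection was restarted */
};

/* Models a pull that drops the lock, letting a stop task slip in. */
static void toy_pull(struct toy_rq *rq)
{
	if (rq->stop_arrives_during_pull) {
		rq->stop_on_rq = true;
		rq->stop_arrives_during_pull = false;
	}
}

static int toy_pick_stop(struct toy_rq *rq)
{
	return rq->stop_on_rq ? TOY_STOP : TOY_NONE;
}

static int toy_pick_dl(struct toy_rq *rq)
{
	return rq->dl_nr_running ? TOY_DL : TOY_NONE;
}

static int toy_pick_rt(struct toy_rq *rq)
{
	toy_pull(rq);
	/* A higher class may have become runnable while the lock was dropped. */
	if (rq->stop_on_rq || rq->dl_nr_running)
		return TOY_RETRY;
	return rq->rt_nr_running ? TOY_RT : TOY_NONE;
}

static int toy_pick_fair(struct toy_rq *rq)
{
	return rq->fair_nr_running ? TOY_FAIR : TOY_NONE;
}

/* Walk the classes from highest to lowest, restarting on TOY_RETRY. */
int toy_pick_next(struct toy_rq *rq)
{
	static int (*const pick[TOY_NR_CLASSES])(struct toy_rq *) = {
		toy_pick_stop, toy_pick_dl, toy_pick_rt, toy_pick_fair,
	};
	int i, ret;

	assert(rq != NULL);
again:
	for (i = 0; i < TOY_NR_CLASSES; i++) {
		ret = pick[i](rq);
		if (ret == TOY_RETRY) {
			rq->retries++;
			goto again;
		}
		if (ret != TOY_NONE)
			return ret;
	}
	return TOY_NONE;	/* nothing runnable: would go idle */
}
```

With rt_nr_running = 1 and stop_arrives_during_pull set, toy_pick_next()
restarts once and returns TOY_STOP. If toy_pick_rt() did not look at
stop_on_rq, it would return TOY_RT and the stop task would be left
waiting, which is the kind of omission the patch is meant to close.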

[PATCH] sched: Check for stop task appearance when balancing happens

Check for a stop task that appears while balancing drops rq->lock, just
like we already do for the other higher-priority classes.

Signed-off-by: Kirill Tkhai <tkhai@xxxxxxxxx>
---
kernel/sched/deadline.c | 11 ++++++++++-
kernel/sched/fair.c | 3 ++-
kernel/sched/rt.c | 7 ++++---
3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 27ef409..b080957 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1021,8 +1021,17 @@ struct task_struct *pick_next_task_dl(struct rq *rq, struct task_struct *prev)

dl_rq = &rq->dl;

- if (need_pull_dl_task(rq, prev))
+ if (need_pull_dl_task(rq, prev)) {
pull_dl_task(rq);
+ /*
+ * pull_dl_task() can drop (and re-acquire) rq->lock; this
+ * means a stop task can slip in, in which case we need to
+ * re-start task selection.
+ */
+ if (rq->stop && rq->stop->on_rq)
+ return RETRY_TASK;
+ }
+
/*
* When prev is DL, we may throttle it in put_prev_task().
* So, we update time before we check for dl_nr_running.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7e9bd0b..c50275b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6727,7 +6727,8 @@ static int idle_balance(struct rq *this_rq)
out:
/* Is there a task of a high priority class? */
if (this_rq->nr_running != this_rq->cfs.h_nr_running &&
- (this_rq->dl.dl_nr_running ||
+ ((this_rq->stop && this_rq->stop->on_rq) ||
+ this_rq->dl.dl_nr_running ||
(this_rq->rt.rt_nr_running && !rt_rq_throttled(&this_rq->rt))))
pulled_task = -1;

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index d8cdf16..bd2267a 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1362,10 +1362,11 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
pull_rt_task(rq);
/*
* pull_rt_task() can drop (and re-acquire) rq->lock; this
- * means a dl task can slip in, in which case we need to
- * re-start task selection.
+ * means a dl or stop task can slip in, in which case we need
+ * to re-start task selection.
*/
- if (unlikely(rq->dl.dl_nr_running))
+ if (unlikely((rq->stop && rq->stop->on_rq) ||
+ rq->dl.dl_nr_running))
return RETRY_TASK;
}
