[PATCH] sched: prevent throttle in early pick_next_task_fair

From: Ben Segall
Date: Mon Apr 06 2015 - 18:28:10 EST


The first call to check_cfs_rq_runtime() in pick_next_task_fair() is only
supposed to trigger when cfs_rq is still an ancestor of prev. However, it
could also trigger on task groups that had just had bandwidth toggled,
because tg_set_cfs_bandwidth() sets runtime_remaining to 0 and
check_cfs_rq_runtime() does not check the global pool.

Fix this by only calling check_cfs_rq_runtime() if we are still in prev's
ancestry, as indicated by a non-NULL cfs_rq->curr.

Reported-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
---
kernel/sched/fair.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
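
The condition described in the changelog can be illustrated with a small
user-space model. This is only a sketch, not kernel code: the model_* names
and the reduced structures are hypothetical, and it assumes that
check_cfs_rq_runtime() reports a throttle whenever bandwidth control is
enabled and the local runtime_remaining is not positive, without consulting
the global pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct model_entity {
	bool on_rq;
};

struct model_cfs_rq {
	bool runtime_enabled;		/* bandwidth control is on for this group */
	int64_t runtime_remaining;	/* local quota; 0 right after a toggle */
	struct model_entity *curr;	/* non-NULL only while in prev's ancestry */
};

/* Assumed behaviour of the check: local pool only, global pool ignored. */
static bool model_check_runtime(const struct model_cfs_rq *rq)
{
	assert(rq != NULL);
	return rq->runtime_enabled && rq->runtime_remaining <= 0;
}

/* Before the patch: the check runs whether or not curr is set. */
static bool model_throttles_before(const struct model_cfs_rq *rq)
{
	return model_check_runtime(rq);
}

/* After the patch: the check runs only when curr is set. */
static bool model_throttles_after(const struct model_cfs_rq *rq)
{
	assert(rq != NULL);
	if (!rq->curr)
		return false;
	return model_check_runtime(rq);
}
```

In this model, a group with runtime_enabled set, runtime_remaining == 0 and
curr == NULL (bandwidth just toggled, not in prev's ancestry) is throttled by
model_throttles_before() but not by model_throttles_after(). A group with the
same quota state and a non-NULL curr is throttled by both.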

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ee595ef..5cb52e9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5038,18 +5038,21 @@ again:
 		 * entity, update_curr() will update its vruntime, otherwise
 		 * forget we've ever seen it.
 		 */
-		if (curr && curr->on_rq)
-			update_curr(cfs_rq);
-		else
-			curr = NULL;
+		if (curr) {
+			if (curr->on_rq)
+				update_curr(cfs_rq);
+			else
+				curr = NULL;
 
-		/*
-		 * This call to check_cfs_rq_runtime() will do the throttle and
-		 * dequeue its entity in the parent(s). Therefore the 'simple'
-		 * nr_running test will indeed be correct.
-		 */
-		if (unlikely(check_cfs_rq_runtime(cfs_rq)))
-			goto simple;
+			/*
+			 * This call to check_cfs_rq_runtime() will do the
+			 * throttle and dequeue its entity in the parent(s).
+			 * Therefore the 'simple' nr_running test will indeed
+			 * be correct.
+			 */
+			if (unlikely(check_cfs_rq_runtime(cfs_rq)))
+				goto simple;
+		}
 
 		se = pick_next_entity(cfs_rq, curr);
 		cfs_rq = group_cfs_rq(se);
--
2.2.0.rc0.207.ga3a616c