[tip:sched/core] sched/debug: Add sched_overutilized tracepoint

From: tip-bot for Qais Yousef
Date: Tue Jun 25 2019 - 04:28:28 EST


Commit-ID: f9f240f96efc5bcec62379eac701523e11fbb45b
Gitweb: https://git.kernel.org/tip/f9f240f96efc5bcec62379eac701523e11fbb45b
Author: Qais Yousef <qais.yousef@xxxxxxx>
AuthorDate: Tue, 4 Jun 2019 12:14:58 +0100
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 24 Jun 2019 19:23:42 +0200

sched/debug: Add sched_overutilized tracepoint

The new tracepoint allows us to track the changes in overutilized
status.

Overutilized status is associated with EAS. It indicates that the system
is in high performance state. EAS is disabled when the system is in this
state since there's not much energy savings while high performance tasks
are pushing the system to the limit and it's better to default to the
spreading behavior of the scheduler.

This tracepoint helps understanding and debugging the conditions under
which this happens.

Signed-off-by: Qais Yousef <qais.yousef@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Quentin Perret <quentin.perret@xxxxxxx>
Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Uwe Kleine-Konig <u.kleine-koenig@xxxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20190604111459.2862-6-qais.yousef@xxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
include/trace/events/sched.h | 4 ++++
kernel/sched/fair.c | 10 ++++++++--
2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index c7dd9bc7f001..420e80e56e55 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -621,6 +621,10 @@ DECLARE_TRACE(pelt_se_tp,
TP_PROTO(struct sched_entity *se),
TP_ARGS(se));

+DECLARE_TRACE(sched_overutilized_tp,
+ TP_PROTO(struct root_domain *rd, bool overutilized),
+ TP_ARGS(rd, overutilized));
+
#endif /* _TRACE_SCHED_H */

/* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 75218ab1fa07..11ec52709323 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5181,8 +5181,10 @@ static inline bool cpu_overutilized(int cpu)

static inline void update_overutilized_status(struct rq *rq)
{
- if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
+ if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
+ trace_sched_overutilized_tp(rq->rd, SG_OVERUTILIZED);
+ }
}
#else
static inline void update_overutilized_status(struct rq *rq) { }
@@ -8214,8 +8216,12 @@ next_group:

/* Update over-utilization (tipping point, U >= 0) indicator */
WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
+ trace_sched_overutilized_tp(rd, sg_status & SG_OVERUTILIZED);
} else if (sg_status & SG_OVERUTILIZED) {
- WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
+ struct root_domain *rd = env->dst_rq->rd;
+
+ WRITE_ONCE(rd->overutilized, SG_OVERUTILIZED);
+ trace_sched_overutilized_tp(rd, SG_OVERUTILIZED);
}
}