[PATCH][V2] sched/rt: Print curr when RT throttling activated

From: Alexandru Ardelean
Date: Tue May 16 2023 - 08:22:12 EST


From: Xianting Tian <tian.xianting@xxxxxxx>

We may hit the situation where one RT thread occupies the CPU for
950ms out of every 1s. The RT thread may be a business thread or some
other unknown thread.

Currently, only the message "sched: RT throttling activated" is printed
when RT throttling happens, which makes it hard to tell which RT thread
is responsible. For further analysis, we need more information in the log.

Print the current RT task when RT throttling is activated, so that the
offending RT thread can be identified right away.
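
With this patch applied, the message looks like the following (the pid
and comm values here are only illustrative):

  sched: RT throttling activated (curr: pid 1234, comm my_rt_task)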

Tested-by: Alexandru Ardelean <alex@xxxxxxxxxxx>
Signed-off-by: Xianting Tian <tian.xianting@xxxxxxx>
---

Initial patch submission:
https://lore.kernel.org/all/f3265adc26d4416dacf157f61fa60ad6@xxxxxxx/T/

We've been having some issues of our own with some applications + some RT
configuration == some threads ended up taking too much CPU time.
This patch came in quite handy for seeing in the logs which thread is the
problematic one.

We've applied this patch onto a 5.10.116 tree. It did need a bit of
rework since some context has changed since the initial version (hence
the V2 tag).
Besides 5.10.116 (where we used it), it also applied cleanly to the
latest/current linux-next tree (hence the Tested-by tag).

It would be nice to have this applied to the mainline kernel as well, so
the extra information is available there too.
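
For anyone who wants to trigger the new message locally, below is a
minimal reproducer sketch (not part of the patch; the file name, chosen
priority and build line are just assumptions): a SCHED_FIFO busy loop
exceeds the default 950ms/1s RT budget and should hit the throttle.

/*
 * rt_hog.c - minimal reproducer sketch, not part of the patch.
 * A SCHED_FIFO thread that busy-loops will exceed the default RT budget
 * (sched_rt_runtime_us=950000 out of sched_rt_period_us=1000000) and
 * should trigger the throttling message in dmesg.
 *
 * Build: gcc -O2 -o rt_hog rt_hog.c
 * Run (needs root or CAP_SYS_NICE): sudo ./rt_hog
 */
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = 50;		/* any valid RT priority works */

	/* Switch the calling thread (pid 0) to SCHED_FIFO. */
	if (sched_setscheduler(0, SCHED_FIFO, &sp)) {
		perror("sched_setscheduler");
		return 1;
	}

	for (;;)
		;			/* burn CPU until the throttle kicks in */
}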

kernel/sched/rt.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 00e0e5074115..44b161e42733 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -995,7 +995,7 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se)
return rt_task_of(rt_se)->prio;
}

-static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
+static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq, struct task_struct *curr)
{
u64 runtime = sched_rt_runtime(rt_rq);

@@ -1019,7 +1019,8 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
*/
if (likely(rt_b->rt_runtime)) {
rt_rq->rt_throttled = 1;
- printk_deferred_once("sched: RT throttling activated\n");
+ printk_deferred_once("sched: RT throttling activated (curr: pid %d, comm %s)\n",
+ curr->pid, curr->comm);
} else {
/*
* In case we did anyway, make it go away,
@@ -1074,7 +1075,7 @@ static void update_curr_rt(struct rq *rq)
if (sched_rt_runtime(rt_rq) != RUNTIME_INF) {
raw_spin_lock(&rt_rq->rt_runtime_lock);
rt_rq->rt_time += delta_exec;
- exceeded = sched_rt_runtime_exceeded(rt_rq);
+ exceeded = sched_rt_runtime_exceeded(rt_rq, curr);
if (exceeded)
resched_curr(rq);
raw_spin_unlock(&rt_rq->rt_runtime_lock);
--
2.40.1