Re: [PATCH] sched/rt: fix incorrect schedstats for rt thread

From: Peter Zijlstra

Date: Mon Jan 12 2026 - 11:38:26 EST


On Fri, Jan 09, 2026 at 03:24:47PM +0800, Dengjun Su wrote:

> For __update_stats_wait_end(), task_on_rq_migrating(p) is needed to
> distinguish between stage 2 and stage 4 because they involve different
> processing flows, but for __update_stats_wait_start(), it is not necessary
> to distinguish between stage 1 and stage 3.
>
> As for adding the condition wait_start > prev_wait_start, I think it is
> more like a mechanism to prevent statistical deviations caused by time
> inconsistencies.

It looks like nonsense to me.. since you have a test-case, could you see
what this does for you?

Specifically:

- it ensures that when not in a migration, prev_wait_start must be 0

- it unconditionally subtracts; unsigned types are defined to wrap
  nicely (2's complement) and it all should work just fine.
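
To illustrate the wrap-around point, here is a minimal userspace sketch. It
assumes that on migration stats->wait_start temporarily holds the wait time
accumulated so far rather than a timestamp, and that the new rq's clock may
be smaller than that value. The helper names are hypothetical and are not
kernel functions:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the unconditional
 * "wait_start -= prev_wait_start" on enqueue: prev_wait is the wait time
 * carried over from the previous runqueue (0 when not migrating).
 */
static uint64_t wait_start_after_enqueue(uint64_t rq_clock_now, uint64_t prev_wait)
{
	/* May wrap below zero; unsigned arithmetic is modulo 2^64. */
	return rq_clock_now - prev_wait;
}

/*
 * Hypothetical stand-in for the delta computed when the wait ends:
 * the same modular subtraction undoes the earlier wrap.
 */
static uint64_t wait_delta(uint64_t rq_clock_now, uint64_t wait_start)
{
	return rq_clock_now - wait_start;
}
```

With a carried-over wait of 300 and a new rq clock of 100, the stored
wait_start wraps to a huge value, yet a later delta taken at clock 150 still
comes out as 350 (300 before the migration plus 50 after it), so no
"wait_start > prev_wait_start" guard is needed for the arithmetic itself.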

---

diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index d1c9429a4ac5..144b23029327 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -12,8 +12,10 @@ void __update_stats_wait_start(struct rq *rq, struct task_struct *p,
 	wait_start = rq_clock(rq);
 	prev_wait_start = schedstat_val(stats->wait_start);
 
-	if (p && likely(wait_start > prev_wait_start))
+	if (p) {
+		WARN_ON_ONCE(!task_on_rq_migrating(p) && prev_wait_start);
 		wait_start -= prev_wait_start;
+	}
 
 	__schedstat_set(stats->wait_start, wait_start);
 }