Re: [tip: sched/core] sched/tracing: Don't re-read p->state when emitting sched_switch event

From: Holger Hoffstätte
Date: Mon Apr 11 2022 - 03:18:28 EST


On 2022-04-11 01:22, Holger Hoffstätte wrote:
On 2022-04-11 00:06, Qais Yousef wrote:
On 04/10/22 00:38, Qais Yousef wrote:
On 03/08/22 18:51, Qais Yousef wrote:
On 03/08/22 19:10, Greg KH wrote:
On Tue, Mar 08, 2022 at 06:02:40PM +0000, Qais Yousef wrote:
+CC stable

On 03/01/22 15:24, tip-bot2 for Valentin Schneider wrote:
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     fa2c3254d7cfff5f7a916ab928a562d1165f17bb
Gitweb:        https://git.kernel.org/tip/fa2c3254d7cfff5f7a916ab928a562d1165f17bb
Author:        Valentin Schneider <valentin.schneider@xxxxxxx>
AuthorDate:    Thu, 20 Jan 2022 16:25:19
Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 01 Mar 2022 16:18:39 +01:00

sched/tracing: Don't re-read p->state when emitting sched_switch event

As of commit

   c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")

the following sequence becomes possible:

                  p->__state = TASK_INTERRUPTIBLE;
                  __schedule()
                    deactivate_task(p);
  ttwu()
    READ !p->on_rq
    p->__state=TASK_WAKING
                    trace_sched_switch()
                      __trace_sched_switch_state()
                        task_state_index()
                          return 0;

TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
the trace event.
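
To illustrate why a TASK_WAKING task is reported as running, here is a minimal
userspace sketch of the masking step. The bit values are modeled on
include/linux/sched.h and should be treated as assumptions; fls_sketch(),
state_index_sketch() and check_examples() are hypothetical names, not kernel
functions.

```c
#include <assert.h>

/* Illustrative state bits; the exact values are assumptions for this sketch. */
#define TASK_RUNNING         0x0000u
#define TASK_INTERRUPTIBLE   0x0001u
#define TASK_UNINTERRUPTIBLE 0x0002u
#define TASK_WAKING          0x0200u
/* Assumed mask of reportable bits; TASK_WAKING lies outside it. */
#define TASK_REPORT          0x007fu

/* 1-based position of the highest set bit, 0 when no bit is set. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Mask away non-reportable bits, then turn the result into an index. */
static unsigned int state_index_sketch(unsigned int state)
{
    return fls_sketch(state & TASK_REPORT);
}

/* Index 0 is what the trace output shows as a running task. */
static void check_examples(void)
{
    assert(state_index_sketch(TASK_RUNNING) == 0);
    assert(state_index_sketch(TASK_INTERRUPTIBLE) == 1);
    assert(state_index_sketch(TASK_UNINTERRUPTIBLE) == 2);
    /* TASK_WAKING is masked out entirely, so it collapses to index 0. */
    assert(state_index_sketch(TASK_WAKING) == 0);
}
```

In this sketch a sleeping task and a waking task only differ in the reported
index if the state is sampled before the waker overwrites it.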

Prevent this by pushing the value read from __schedule() down the trace
event.
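
The difference between re-reading the state and passing a snapshot can be
sketched in userspace as below. This is a single-threaded replay of the
interleaving, not kernel code; struct task_sketch, trace_rereads_state(),
trace_uses_snapshot() and replay_race() are hypothetical names, and the bit
values are assumptions modeled on include/linux/sched.h.

```c
/* Illustrative state bits; the exact values are assumptions for this sketch. */
#define TASK_INTERRUPTIBLE 0x0001u
#define TASK_WAKING        0x0200u
#define TASK_REPORT        0x007fu

struct task_sketch {
    unsigned int state;
};

/* Keep only the bits that a trace event would report. */
static unsigned int reportable(unsigned int state)
{
    return state & TASK_REPORT;
}

/* Old behaviour: the trace path reads the task's state when it emits. */
static unsigned int trace_rereads_state(const struct task_sketch *p)
{
    return reportable(p->state);
}

/* New behaviour: the scheduler hands down the value it read earlier. */
static unsigned int trace_uses_snapshot(unsigned int prev_state)
{
    return reportable(prev_state);
}

/* Replay the interleaving from the commit message in program order. */
static void replay_race(unsigned int *old_result, unsigned int *new_result)
{
    struct task_sketch p = { .state = TASK_INTERRUPTIBLE };

    /* Scheduler side: read the state once before deactivating the task. */
    unsigned int prev_state = p.state;

    /* Waker side: sees the task off the runqueue and marks it waking. */
    p.state = TASK_WAKING;

    /* Trace emission happens only now, after the waker has written. */
    *old_result = trace_rereads_state(&p);
    *new_result = trace_uses_snapshot(prev_state);
}
```

With the re-read, the event reports 0 (running); with the snapshot, it
reports the TASK_INTERRUPTIBLE bit that the scheduler actually acted on.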

Reported-by: Abhijeet Dharmapurikar <adharmap@xxxxxxxxxxx>
Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@xxxxxxx

Any objection to picking this for stable? I'm interested in this one for some
Android users, but would prefer that stable take it rather than backporting it
individually.

I think it makes sense to pick the next one in the series too.

What commit does this fix in Linus's tree?

It should be this one: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")

Should this be okay to be picked up by stable now? I can see AUTOSEL has picked
it up for v5.15+, but it impacts v5.10 too.

commit: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
subject: sched/tracing: Don't re-read p->state when emitting sched_switch event

This patch has an impact on Android 5.10 users who experience tooling breakage.
Is it possible to include it in 5.10 LTS please?

It was already picked up for 5.15+ by AUTOSEL and only 5.10 is missing.


https://lore.kernel.org/stable/Yk2PQzynOVOzJdPo@xxxxxxxxx/

However, since then further investigation (still in progress) has shown that this
may have been the fault of the tool in question, so if you can verify that tracing
sched still works for you with this patch in 5.15.x then by all means
let's merge it.

So it turns out the lockup is indeed the fault of the tool, which contains multiple
kernel-version-dependent tracepoint definitions and now fails with this
patch.

Greg, please re-enqueue this patch where necessary (5.10, 5.15+)
-h