Re: [PATCH sched/core] sched/rt: Fix RT_PUSH_IPI soft lockup loop
From: Steven Rostedt
Date: Wed May 13 2026 - 21:31:46 EST
On Wed, 13 May 2026 14:53:21 -1000
Tejun Heo <tj@xxxxxxxxxx> wrote:
> > This still doesn't explain to me why the current process is of a lower
> > priority than a waiting RT task.
>
> 1. The CPU was running a fair task.
>
> 2. IRQ triggers which creates softirq work.
>
> 3. Either IRQ, softirq or another CPU wakes up multiple RT tasks to the CPU.
>
> 4. The CPU enters softirq.
OK, this is what I was missing: the CPU was in softirq context at the
time, and that softirq processing ran for a very long time, which
prevents the schedule from happening.
>
> 5. Other CPUs keep sending pull IPIs, slowing softirq processing.
>
> 6. Before softirq processing finishes, another IRQ happens which creates
> more softirq work. Go back to 4.
>
> > I'm really starting to think you are fixing a symptom and not the cause.
>
> It seems relatively straightforward to me. The CPU was relatively loaded
> with irq/softirq. While in irq context, RT tasks wake up to it and then the
> CPU gets hammered by pull IPIs to the point where it's constantly chasing
> new softirq work and thus can't leave irq context in a reasonable amount of
> time. What am I missing?
So if the currently running task is SCHED_OTHER, we still need to handle
the case where the next task is pinned. Trying to move the fair task
would trigger the warning again, and it wouldn't fix the overloading
anyway.
I think this requires a bit more complex fix. Perhaps if the current
task is fair and the next task is pinned, it needs to look for the task
after that one to move.
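
Something like the following is a minimal sketch of that idea, written as a
userspace model rather than kernel code. The struct, the function name and the
array-based queue are all hypothetical stand-ins (assumptions for
illustration), not the real rt.c data structures. It only shows the selection
rule: when current is fair and the head of the pushable queue is pinned, walk
past the pinned tasks to the first one that is allowed to migrate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queued RT task; not a kernel structure. */
struct model_task {
	int prio;            /* lower value means higher RT priority */
	int nr_cpus_allowed; /* 1 means the task is pinned to this CPU */
};

static bool model_task_pinned(const struct model_task *t)
{
	return t->nr_cpus_allowed <= 1;
}

/*
 * queue[] models the waiting RT tasks, sorted highest priority first.
 * Returns the index of the task to push away, or -1 if there is none.
 *
 * Assumption: only the "current is fair" case looks past a pinned head.
 * What to do when current is an RT task and the head is pinned is out
 * of scope here, so the model simply reports "nothing to push".
 */
static int model_pick_task_to_push(const struct model_task *queue, size_t n,
				   bool curr_is_fair)
{
	size_t i;

	assert(queue != NULL || n == 0);

	if (n == 0)
		return -1;

	/* Head of the queue can migrate: push it, as before. */
	if (!model_task_pinned(&queue[0]))
		return 0;

	/* Head is pinned and current is not fair: not modeled. */
	if (!curr_is_fair)
		return -1;

	/* Head is pinned and current is fair: find the next movable task. */
	for (i = 1; i < n; i++) {
		if (!model_task_pinned(&queue[i]))
			return (int)i;
	}

	return -1; /* every waiting task is pinned, nothing can be moved */
}
```

With a queue of { pinned, pinned, movable } and a fair current task this picks
index 2, so the fair task is never the one being moved. If every waiting task
is pinned it returns -1, and the overload has to be resolved some other way.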
-- Steve