Re: [PATCH] softirq: Do not loop if running under a real-time task

From: Petr Malat
Date: Thu Mar 09 2023 - 08:04:13 EST


On Wed, Mar 08, 2023 at 10:14:58AM +0100, Sebastian Andrzej Siewior wrote:
> On 2023-03-06 16:45:48 [+0100], Petr Malat wrote:
> > Softirq processing can be a source of a scheduling jitter if it executes
> > in a real-time task as in that case need_resched() is false unless there
> > is another runnable task with a higher priority. This is especially bad
> > if the softirq processing runs in a migration thread, which has priority
> > 99 and usually runs for a short time.
> >
> > One option would be to not restart the softirq processing if there is
> > another runnable task to allow the high prio task to finish and yield the
> > CPU, the second one is to not restart if softirq executes in a real-time
> > task. Usually, real-time tasks don't want to be interrupted, so implement
> > the second option.
>
> This affects only PREEMPT_RT, right?
I have observed the issue on 5.15 CONFIG_PREEMPT=y arm32 kernel.
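
For reference, the idea boils down to adding a real-time check to the
restart condition at the end of __do_softirq(). A minimal sketch against
the 5.15 loop (the exact predicate in the actual patch may differ):

	pending = local_softirq_pending();
	if (pending) {
		/* Restart only if there is time and budget left, nobody
		 * needs the CPU and - the new check - we are not running
		 * in the context of a real-time task. */
		if (time_before(jiffies, end) && !need_resched() &&
		    !rt_task(current) && --max_restart)
			goto restart;

		wakeup_softirqd();	/* defer the rest to ksoftirqd */
	}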

> I have plans to redo parts of it. You shouldn't enter ksoftirqd to
> begin with. There is this ktimerd in v6.1 which mitigates this to some
> point and I plan to extend it to also cover the sched-softirq.
> Other than that, you are right in saying that the softirq must not
> continue with a RT prio and that need_resched() is not visible here.
> However ksoftirqd itself must be able to do loops unless the
> need-resched flag is seen.
>
> Since you mentioned the migration thread, how often do you see this or how
> does this trigger?
I have seen only one occurrence where I have a backtrace available
(from hundreds of systems). I think that's because on my system it can
occur only if it hits the migration thread; otherwise there are more
runnable threads of the same priority and need_resched() breaks the loop.
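
Assuming the application threads are SCHED_RR, what normally breaks the
loop is the timeslice tick - simplified from task_tick_rt() in
kernel/sched/rt.c:

	if (p->policy != SCHED_RR)
		return;			/* SCHED_FIFO: no timeslice */

	if (--p->rt.time_slice)
		return;

	p->rt.time_slice = sched_rr_timeslice;

	if (/* a same-priority peer is queued */) {
		requeue_task_rt(rq, p, 0);
		resched_curr(rq);	/* sets need_resched() */
	}

The migration thread runs in the stop scheduling class above all RT
tasks, so nothing comparable ever sets the flag for it.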

I obtained the stack trace with a debugging module that uses a periodic
timer to monitor active tasks and dumps a stack trace when it finds
something fishy (a stripped-down sketch of the module follows after the
trace). This is what I got:
[<bf84f559>] (hogger_handler [hogger]) from [<c04850ef>] (__hrtimer_run_queues+0x13f/0x2f4)
[<c04850ef>] (__hrtimer_run_queues) from [<c04858a5>] (hrtimer_interrupt+0xc9/0x1c4)
[<c04858a5>] (hrtimer_interrupt) from [<c0810533>] (arch_timer_handler_phys+0x27/0x2c)
[<c0810533>] (arch_timer_handler_phys) from [<c046de3b>] (handle_percpu_devid_irq+0x5b/0x1e4)
[<c046de3b>] (handle_percpu_devid_irq) from [<c0469a27>] (__handle_domain_irq+0x53/0x94)
[<c0469a27>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
[<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
Exception stack(0xc1595ca8 to 0xc1595cf0)
[<c0400ad3>] (__irq_svc) from [<c098e404>] (_raw_spin_unlock_irqrestore+0x1c/0x3c)
[<c098e404>] (_raw_spin_unlock_irqrestore) from [<c0446b6d>] (try_to_wake_up+0x1d9/0x5d0)
[<c0446b6d>] (try_to_wake_up) from [<c0483d2d>] (call_timer_fn+0x31/0x16c)
[<c0483d2d>] (call_timer_fn) from [<c048406f>] (run_timer_softirq+0x207/0x2d4)
[<c048406f>] (run_timer_softirq) from [<c0401293>] (__do_softirq+0xd3/0x2f8)
[<c0401293>] (__do_softirq) from [<c042876b>] (irq_exit+0x57/0x78)
[<c042876b>] (irq_exit) from [<c0469a2b>] (__handle_domain_irq+0x57/0x94)
[<c0469a2b>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
[<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
Exception stack(0xc1595e78 to 0xc1595ec0)
[<c0400ad3>] (__irq_svc) from [<c044d37c>] (active_load_balance_cpu_stop+0x1ec/0x234)
[<c044d37c>] (active_load_balance_cpu_stop) from [<c04ac099>] (cpu_stopper_thread+0x69/0xd8)
[<c04ac099>] (cpu_stopper_thread) from [<c0440b53>] (smpboot_thread_fn+0x9f/0x17c)
[<c0440b53>] (smpboot_thread_fn) from [<c043ccf9>] (kthread+0x129/0x12c)
[<c043ccf9>] (kthread) from [<c0400131>] (ret_from_fork+0x11/0x20)
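
Stripped down, the module looks roughly like this (the thresholds and
bookkeeping here are illustrative, the real module keeps per-CPU state):

	#include <linux/module.h>
	#include <linux/hrtimer.h>
	#include <linux/sched/rt.h>
	#include <linux/sched/debug.h>

	static struct hrtimer hog_timer;
	static struct task_struct *last_seen;
	static unsigned int ticks_stuck;

	/* Fires every 10 ms in hard-IRQ context; if the same real-time
	 * task stays current for too many consecutive ticks, dump its
	 * stack - that is how the trace above was captured. */
	static enum hrtimer_restart hogger_handler(struct hrtimer *t)
	{
		if (rt_task(current) && current == last_seen) {
			if (++ticks_stuck == 10)
				sched_show_task(current);
		} else {
			last_seen = current;
			ticks_stuck = 0;
		}
		hrtimer_forward_now(t, ms_to_ktime(10));
		return HRTIMER_RESTART;
	}

	static int __init hog_init(void)
	{
		hrtimer_init(&hog_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
		hog_timer.function = hogger_handler;
		hrtimer_start(&hog_timer, ms_to_ktime(10), HRTIMER_MODE_REL);
		return 0;
	}

	static void __exit hog_exit(void)
	{
		hrtimer_cancel(&hog_timer);
	}

	module_init(hog_init);
	module_exit(hog_exit);
	MODULE_LICENSE("GPL");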

I then looked into the code to find out how it could happen that
softirqs were not offloaded to ksoftirqd, and the only explanation I
have is what I described in the original mail.
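
For reference, the decision happens roughly here - simplified from
invoke_softirq() in kernel/softirq.c (5.15). Unless softirq handling is
forced into threads, __do_softirq() runs inline in the context of
whatever task the interrupt happened to hit, here the migration thread:

	static inline void invoke_softirq(void)
	{
		if (ksoftirqd_running(local_softirq_pending()))
			return;		/* ksoftirqd is already on it */

		if (!force_irqthreads())
			__do_softirq();	/* inline, in current's context */
		else
			wakeup_softirqd();
	}
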
BR,
Petr