Re: [RFC PATCH 0/6] softirq: Start pushing down the big softirq lock

From: Sebastian Andrzej Siewior
Date: Mon Aug 07 2023 - 08:50:29 EST


On 2023-08-01 15:24:35 [+0200], Frederic Weisbecker wrote:
> Networking softirqs can take time, holding the execution of block,
> tasklets, timers and RCU callbacks.
>
> People fight hard through this big softirq lock, proposing more and
> more hacks over the years to deal with the resulting fundamental
> unfairness that is not only a problem for RT users.
>
> Here is a proposal for an entrypoint to dealing with that issue in the
> long term. The purpose is to adopt a similar journey to the one we took
> with the BKL push-down but with timers. Most timers are unrelated to
> other softirq vectors, those can simply be tagged with the new
> TIMER_SOFTINTERRUPTIBLE flag that makes a callback soft-interruptible.
> The others can carry the TIMER_SOFTINTERRUPTIBLE after they get converted
> to use appropriate synchronization against other vectors callbacks
> (using spin_lock_bh() for example).

This doesn't work as proposed because of lock ordering:
|======================================================
|WARNING: possible circular locking dependency detected
|6.5.0-rc4-rt1+ #220 Not tainted
|------------------------------------------------------
|ktimers/0/15 is trying to acquire lock:
|ffff88817b41b6d8 ((softirq_ctrl.lock)){+.+.}-{2:2}, at: __local_bh_disable_ip+0xb7/0x1a0
|
|but task is already holding lock:
|ffff88817b41c820 (&base->expiry_lock){+.+.}-{2:2}, at: run_timer_softirq+0x61/0x3f0
|
|which lock already depends on the new lock.
|
|
|the existing dependency chain (in reverse order) is:
|
|-> #1 (&base->expiry_lock){+.+.}-{2:2}:
| lock_acquire+0xd4/0x2f0
| rt_spin_lock+0x21/0xf0
| run_timer_softirq+0x61/0x3f0
| __do_softirq+0x19b/0x4cb
| run_timersd+0x92/0xf0
| smpboot_thread_fn+0x211/0x330
| kthread+0x110/0x130
| ret_from_fork+0x2b/0x40
| ret_from_fork_asm+0x1b/0x30
|
|-> #0 ((softirq_ctrl.lock)){+.+.}-{2:2}:
| check_prev_add+0xe2/0xd60
| __lock_acquire+0x132d/0x1700
| lock_acquire+0xd4/0x2f0
| rt_spin_lock+0x21/0xf0
| __local_bh_disable_ip+0xb7/0x1a0
| call_timer_fn+0x172/0x310
| run_timer_softirq+0x331/0x3f0
| __do_softirq+0x19b/0x4cb
| run_timersd+0x92/0xf0
| smpboot_thread_fn+0x211/0x330
| kthread+0x110/0x130
| ret_from_fork+0x2b/0x40
| ret_from_fork_asm+0x1b/0x30
|
|other info that might help us debug this:
|
| Possible unsafe locking scenario:
|
| CPU0 CPU1
| ---- ----
| lock(&base->expiry_lock);
| lock((softirq_ctrl.lock));
| lock(&base->expiry_lock);
| lock((softirq_ctrl.lock));
|
| *** DEADLOCK ***
|
|2 locks held by ktimers/0/15:
| #0: ffffffff826e9ce0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5d/0xf0
| #1: ffff88817b41c820 (&base->expiry_lock){+.+.}-{2:2}, at: run_timer_softirq+0x61/0x3f0

I posted a different series last week where I drop the lock for other
reasons at a different spot where it is safe to do so. It needs adopting
other level softirq handler (besides TIMER_SOFTIRQ) but there is no
need to deal with the individual callback.

However, how do you continue here? Assuming all timers are marked
TIMER_SOFTINTERRUPTIBLE then you could avoid the BH-lock at the
timer-softirq.
But when is a timer considered safe? Would the lack of the _bh suffix be
that and you would simply add it during your push down?
Then you continue the same thing for the remaining softirqs. And once
you are done you would remove that RT lock within local_bh_disable()?
This isn't something a !RT user would benefit, right?

The other idea I have (besides the preemption point in each softirq
handler (mentioned earlier)) is to simple drop the BH-lock on RT. Unlike
mainline, RT wouldn't deadlock then. The only that would be missing is
synchronisation against local_bh_disable() only locking for variables.
>From what I remember from the various BH-models we have in RT in the
past, that was the only thing that exploded.

Sebastian