Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization
From: Eric Dumazet
Date: Mon Mar 24 2025 - 08:24:13 EST
On Mon, Mar 24, 2025 at 1:12 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 21, 2025 at 02:30:49AM -0700, Breno Leitao wrote:
> > lockdep_unregister_key() is called from critical code paths, including
> > sections where rtnl_lock() is held. For example, when replacing a qdisc
> > in a network device, network egress traffic is disabled while
> > __qdisc_destroy() is called for every network queue.
> >
> > If lockdep is enabled, __qdisc_destroy() calls lockdep_unregister_key(),
> > which gets blocked waiting for synchronize_rcu() to complete.
> >
> > For example, a simple tc command to replace a qdisc could take 13
> > seconds:
> >
> > # time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq
> > real 0m13.195s
> > user 0m0.001s
> > sys 0m2.746s
> >
> > During this time, network egress is completely frozen while waiting for
> > RCU synchronization.
> >
> > Use synchronize_rcu_expedited() instead to minimize the impact on
> > critical operations like network connectivity changes.
> >
> > This improves 10x the function call to tc, when replacing the qdisc for
> > a network card.
> >
> > # time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq
> > real 0m1.789s
> > user 0m0.000s
> > sys 0m1.613s
> >
> > Reported-by: Erik Lundgren <elundgren@xxxxxxxx>
> > Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> > Reviewed-by: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > ---
> > kernel/locking/lockdep.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 4470680f02269..a79030ac36dd4 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -6595,8 +6595,10 @@ void lockdep_unregister_key(struct lock_class_key *key)
> > if (need_callback)
> > call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
> >
> > - /* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
> > - synchronize_rcu();
> > + /* Wait until is_dynamic_key() has finished accessing k->hash_entry.
> > + * This needs to be quick, since it is called in critical sections
> > + */
> > + synchronize_rcu_expedited();
> > }
> > EXPORT_SYMBOL_GPL(lockdep_unregister_key);
>
> So I fundamentally despise synchronize_rcu_expedited(), also your
> comment style is broken.
>
> Why can't qdisc call this outside of the lock?
Good luck with that. And anyway, the time to call it 256 times would
still hurt Breno's use case.
My suggestion was to change the lockdep_unregister_key() contract and use
kfree_rcu() there:
> I think we should redesign lockdep_unregister_key() to work on a separately
> allocated piece of memory,
> then use kfree_rcu() in it.
>
> Ie not embed a "struct lock_class_key" in the struct Qdisc, but a pointer to
>
> struct ... {
> struct lock_class_key key;
> struct rcu_head rcu;
> }
This is more work, because it requires changing all lockdep_unregister_key() users.
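
To make the idea concrete, here is a rough userspace model of the shape I have in mind. It is only a sketch: every mock_* name, struct dyn_key, and demo_destroy_queues() are made up for illustration and are not kernel APIs. mock_kfree_rcu() stands in for kfree_rcu() and mock_grace_period() stands in for a grace period ending. It only models the deferred free, not the hash removal or class zapping that lockdep_unregister_key() also does.

```c
/* Userspace model only: none of these names are the kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mock_lock_class_key { int dummy; };	/* stand-in for struct lock_class_key */
struct mock_rcu_head { struct mock_rcu_head *next; };	/* stand-in for struct rcu_head */

/* The key lives in its own allocation so it can outlive its owner. */
struct dyn_key {
	struct mock_lock_class_key key;
	struct mock_rcu_head rcu;
};

/* The owner holds a pointer instead of embedding the key. */
struct mock_qdisc {
	struct dyn_key *dkey;
};

static struct mock_rcu_head *pending;	/* objects waiting for a grace period */
static size_t pending_count;
static size_t freed_count;

/* Models kfree_rcu(): queue the object and return immediately. */
static void mock_kfree_rcu(struct dyn_key *dk)
{
	dk->rcu.next = pending;
	pending = &dk->rcu;
	pending_count++;
}

/* Models the end of a grace period: free everything that was queued. */
static void mock_grace_period(void)
{
	while (pending) {
		struct mock_rcu_head *head = pending;

		pending = head->next;
		free((char *)head - offsetof(struct dyn_key, rcu));
		pending_count--;
		freed_count++;
	}
}

static int mock_qdisc_init(struct mock_qdisc *q)
{
	q->dkey = calloc(1, sizeof(*q->dkey));
	return q->dkey ? 0 : -1;
}

/* Teardown never waits: there is no synchronize_rcu() equivalent here. */
static void mock_qdisc_destroy(struct mock_qdisc *q)
{
	mock_kfree_rcu(q->dkey);
	q->dkey = NULL;
}

/* Destroy n queues back to back; return how many frees are still pending. */
static size_t demo_destroy_queues(size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_qdisc q;

		if (mock_qdisc_init(&q))
			break;
		mock_qdisc_destroy(&q);
	}
	return pending_count;
}
```

In this model, destroying 256 queues just queues 256 deferred frees, and a single grace period later releases all of them, instead of the teardown path waiting once per queue.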