Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization

From: Waiman Long
Date: Mon Mar 31 2025 - 17:47:56 EST

Next message: Joanne Koong: "Re: [PATCH 2/2] fuse: Adjust readdir() buffer to requesting buffer size."
Previous message: Jeff Johnson: "Re: [PATCH ath-next v3 0/6] wifi: ath11k: bring hibernation support back"
In reply to: Boqun Feng: "Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization"
Next in thread: Eric Dumazet: "Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 3/31/25 5:21 PM, Boqun Feng wrote:

On Mon, Mar 31, 2025 at 02:57:20PM -0400, Waiman Long wrote:

On 3/31/25 2:33 PM, Paul E. McKenney wrote:

On Mon, Mar 31, 2025 at 01:33:22PM -0400, Waiman Long wrote:

On 3/31/25 1:26 PM, Boqun Feng wrote:

On Wed, Mar 26, 2025 at 11:39:49AM -0400, Waiman Long wrote:
[...]

Anyway, that may work. The only problem that I see is the issue of nesting
of an interrupt context on top of a task context. It is possible that the
first use of a raw_spinlock may happen in an interrupt context. If the
interrupt happens when the task has set the hazard pointer and iterating the
hash list, the value of the hazard pointer may be overwritten. Alternatively
we could have multiple slots for the hazard pointer, but that will make the
code more complicated. Or we could disable interrupt before setting the
hazard pointer.

Or we can use lockdep_recursion:

preempt_disable();
lockdep_recursion_inc();
barrier();

WRITE_ONCE(*hazptr, ...);

, it should prevent the re-entrant of lockdep in irq.

That will probably work. Or we can disable irq. I am fine with both.

Disabling irq may not work in this case, because an NMI can also happen
and call register_lock_class().

Right, disabling irq doesn't work with NMI. So incrementing the recursion
count is likely the way to go and I think it will work even in the NMI case.

I'm experimenting a new idea here, it might be better (for general
cases), and this has the similar spirit that we could move the
protection scope of a hazard pointer from a key to a hash_list: we can
introduce a wildcard address, and whenever we do a synchronize_hazptr(),
if the hazptr slot equal to wildcard, we treat as it matches to any ptr,
hence synchronize_hazptr() will still wait until it's zero'd. Not only
this could help in the nesting case, it can also be used if the users
want to protect multiple things with this simple hazard pointer
implementation.

I think it is a good idea to add a wildcard for the general use case.
Setting the hazptr to the list head will be enough for this particular case.

Careful! If we enable use of wildcards outside of the special case
of synchronize_hazptr(), we give up the small-memory-footprint advantages
of hazard pointers. You end up having to wait on all hazard-pointer
readers, which was exactly why RCU was troublesome here. ;-)

Technically, only the hazard-pointer readers that have switched to
wildcard mode because multiple hazptr critical sections ;-)

If the plan is to have one global set of hazard pointers for all the

A global set of hazard pointers for all the possible use cases is the
current plan (at least it should be when we have fully-featured hazptr
[1]). Because the hazard pointer value already points the the data to
protect, so no need to group things into "domain"s.

possible use cases, supporting wildcard may be a problem. If we allow

I had some off-list discussions with Paul, and I ended up with the idea
of user-specific wildcard (i.e. different users can have different
wildcards) + one global set of hazard pointers. However, it just occured
to me that it won'd quite work in this simple hazard pointer
implementation (one slot per-CPU) :( Because you can have a user A's
hazptr critical interrupted by a user B's interrupt handler, and if both
A & B are using customized wildcard but they don't know each other, it's
not going to work by setting either wildcard value into the slot.

To make it clear for the discussion, we have two hazard pointer
implementations:

1. The fully-featured one [1], which allow users to provide memory for
hazptr slots, so no issue about nesting/re-entry etc. And wildcard
doesn't make sense in this implemenation.

2. The simple variant, which is what I've proposed in this thread, and
since it only has one slot per CPU, either all the users need to
prevent the re-entries or we need a global wildcard. Also the readers
of the simple variant need to disable preemption regardlessly because
it only has one hazptr slot to use. That means its read-side critical
section should be short usually.

I could try to use the fully-featured one in lockdep, what I need to do
is creating enough hazard_context so we have enough slots for lockdep
and may or may not need lockdep_recursion to prevent reentries. However,
I still believe (or I don't have data to show otherwise) that the simple
variant with one slot per CPU + global wildcard will work fine in
practice.

So what I would like to do is introducing the simple variant as a
general API with a global wildcard (because without it, it cannot be a
general API because one user have to prevent entering another user's
critical section), and lockdep can use it. And we can monitor the
delay of synchronize_shazptr() and if wildcard becomes a problem, move
to a fully-featured hazptr implementation. Sounds like a plan?

[1]: https://lore.kernel.org/lkml/20240917143402.930114-2-boqun.feng@xxxxxxxxx/

Thank for the detailed explanation. I am looking forward to your new hazptr patch series.

Cheers,
Longman

Regards,
Boqun

different sets of hazard pointers for different use cases, it will be less
an issue. Anyway, maybe we should skip wildcard for the current case so that
we have more time to think through it first.

Cheers,
Longman

Next message: Joanne Koong: "Re: [PATCH 2/2] fuse: Adjust readdir() buffer to requesting buffer size."
Previous message: Jeff Johnson: "Re: [PATCH ath-next v3 0/6] wifi: ath11k: bring hibernation support back"
In reply to: Boqun Feng: "Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization"
Next in thread: Eric Dumazet: "Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]