On 3/26/25 1:25 AM, Boqun Feng wrote:If we are using hazard pointer for synchronization, should we also take off "_rcu" from the list iteration/insertion/deletion macros to avoid the confusion that RCU is being used?
Providing a second API and ask callers to do the right thing is probably not a good idea and mistake is going to be made sooner or later.It looks like you are trying hard to find a use case for hazard pointer inWell, if it does the job, why not use it 😉 Also this shows how
the kernel 🙂
flexible hazard pointers can be.
At least when using hazard pointers, the reader side of the hash list
iteration is still lockless. Plus, since the synchronization part
doesn't need to wait for the RCU readers in the whole system, it will be
faster (I tried with the protecting-the-whole-hash-list approach as
well, it's the same result on the tc command). This is why I choose to
look into hazard pointers. Another mechanism can achieve the similar
behavior is SRCU, but SRCU is slightly heavier compared to hazard
pointers in this case (of course SRCU has more functionalities).
We can provide a lockdep_unregister_key_nosync() without the
synchronize_rcu() in it and let users do the synchronization, but it's
going to be hard to enforce and review, especially when someone
refactors the code and move the free code to somewhere else.
That will probably work. Or we can disable irq. I am fine with both.Anyway, that may work. The only problem that I see is the issue of nestingOr we can use lockdep_recursion:
of an interrupt context on top of a task context. It is possible that the
first use of a raw_spinlock may happen in an interrupt context. If the
interrupt happens when the task has set the hazard pointer and iterating the
hash list, the value of the hazard pointer may be overwritten. Alternatively
we could have multiple slots for the hazard pointer, but that will make the
code more complicated. Or we could disable interrupt before setting the
hazard pointer.
preempt_disable();
lockdep_recursion_inc();
barrier();
WRITE_ONCE(*hazptr, ...);
, it should prevent the re-entrant of lockdep in irq.
The solution that I am thinking about is to have a simple unfair rwlock toThis could work, one thing though is that locks don't compose. Using a
protect just the hash list iteration. lockdep_unregister_key() and
lockdep_register_key() take the write lock with interrupt disabled. While
is_dynamic_key() takes the read lock. Nesting in this case isn't a problem
and we don't need RCU to protect the iteration process and so the last
synchronize_rcu() call isn't needed. The level of contention should be low
enough that live lock isn't an issue.
hash write_lock in lockdep_unregister_key() will create a lockdep_lock()
-> "hash write_lock" dependency, and that means you cannot
lockdep_lock() while you're holding a hash read_lock, although it's
not the case today, but it certainly complicates the locking design
inside lockdep where there's no lockdep to help 😉
Thinking about it more, doing it in a lockless way is probably a good idea.