On Mon, Mar 31, 2025 at 01:33:22PM -0400, Waiman Long wrote:
> On 3/31/25 1:26 PM, Boqun Feng wrote:
> > On Wed, Mar 26, 2025 at 11:39:49AM -0400, Waiman Long wrote:
> > [...]
> > > > > Anyway, that may work. The only problem that I see is the issue of nesting
> > > > > of an interrupt context on top of a task context. It is possible that the
> > > > > first use of a raw_spinlock may happen in an interrupt context. If the
> > > > > interrupt happens when the task has set the hazard pointer and iterating the
> > > > > hash list, the value of the hazard pointer may be overwritten. Alternatively
> > > > > we could have multiple slots for the hazard pointer, but that will make the
> > > > > code more complicated. Or we could disable interrupt before setting the
> > > > > hazard pointer.
> > > > Or we can use lockdep_recursion:
> > > >
> > > > 	preempt_disable();
> > > > 	lockdep_recursion_inc();
> > > > 	barrier();
> > > >
> > > > 	WRITE_ONCE(*hazptr, ...);
> > > >
> > > > , it should prevent the re-entrant of lockdep in irq.
> > > That will probably work. Or we can disable irq. I am fine with both.
> > Disabling irq may not work in this case, because an NMI can also happen
> > and call register_lock_class().
> Right, disabling irq doesn't work with NMI. So incrementing the recursion
> count is likely the way to go and I think it will work even in the NMI case.
>
> > I'm experimenting a new idea here, it might be better (for general
> > cases), and this has the similar spirit that we could move the
> > protection scope of a hazard pointer from a key to a hash_list: we can
> > introduce a wildcard address, and whenever we do a synchronize_hazptr(),
> > if the hazptr slot equal to wildcard, we treat as it matches to any ptr,
> > hence synchronize_hazptr() will still wait until it's zero'd. Not only
> > this could help in the nesting case, it can also be used if the users
> > want to protect multiple things with this simple hazard pointer
> > implementation.
> I think it is a good idea to add a wildcard for the general use case.
> Setting the hazptr to the list head will be enough for this particular case.

Careful! If we enable use of wildcards outside of the special case
of synchronize_hazptr(), we give up the small-memory-footprint advantages
of hazard pointers. You end up having to wait on all hazard-pointer
readers, which was exactly why RCU was troublesome here. ;-)
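
To make that cost concrete, here is a minimal userspace sketch of a slot
scan with a wildcard, using C11 atomics rather than the kernel code under
discussion. HAZPTR_WILDCARD, hazptr_set(), hazptr_blocks() and
hazptr_count_blockers() are hypothetical names picked for illustration;
only synchronize_hazptr() comes from the thread, and the real
implementation may well look different. A real updater would loop until
nothing blocks it, while this sketch only counts the blockers in one pass.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS 4	/* assumption: stand-in for one slot per CPU */

/* Hypothetical sentinel: an address no real object can have. */
#define HAZPTR_WILDCARD ((void *)(uintptr_t)1)

/* Static storage, so every slot starts out as NULL (empty). */
static _Atomic(void *) hazptr_slots[NR_SLOTS];

/* Reader side: publish what this slot protects; NULL clears the slot. */
static void hazptr_set(size_t slot, void *ptr)
{
	assert(slot < NR_SLOTS);
	atomic_store(&hazptr_slots[slot], ptr);
}

/* Does the value found in a slot block an updater waiting on @ptr? */
static bool hazptr_blocks(void *slot_val, void *ptr)
{
	/* An empty slot never blocks; a wildcard slot blocks everyone. */
	return slot_val &&
	       (slot_val == ptr || slot_val == HAZPTR_WILDCARD);
}

/*
 * Updater side, single pass: count the slots that a
 * synchronize_hazptr(ptr)-style wait would still have to wait on.
 */
static size_t hazptr_count_blockers(void *ptr)
{
	size_t i, n = 0;

	for (i = 0; i < NR_SLOTS; i++)
		if (hazptr_blocks(atomic_load(&hazptr_slots[i]), ptr))
			n++;
	return n;
}
```

With one slot holding the address of some object and another holding the
wildcard, an updater waiting on an unrelated pointer still finds one
blocker. That is the loss of selectivity described above: every wildcard
reader holds up every updater, whatever the updater is waiting for.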