On Tue, Mar 25, 2025 at 03:23:30PM -0400, Waiman Long wrote:
[...]
Oh, I see. And yes, the hazard pointers I proposed cannot prevent thisWhat I am saying is that register_lock_class() is trying to find a newkeyThe key stored in lockdep_key_hazptr is the one that the rest of theThat commit seemed fixing a race between disabling lockdep andI agree that the commit that I mentioned is not relevant to the current
unregistering key, and most importantly, call zap_class() for the
unregistered key even if lockdep is disabled (debug_locks = 0). It might
be related, but I'm not sure that's the reason of putting
synchronize_rcu() there. Say you want to synchronize between
/proc/lockdep and lockdep_unregister_key(), and you have
synchronize_rcu() in lockdep_unregister_key(), what's the RCU read-side
critical section at /proc/lockdep?
case. You are right that is_dynamic_key() is the only function that is
problematic, the other two are protected by the lockdep_lock. So they are
safe. Anyway, I believe that the actual race happens in the iteration of the
hashed list in is_dynamic_key(). The key that you save in the
lockdep_key_hazptr in your proposed patch should never be the key (dead_key)
function will use after is_dynamic_key() return true. That is,
CPU 0 CPU 1
===== =====
WRITE_ONCE(*lockdep_key_hazptr, key);
smp_mb();
is_dynamic_key():
hlist_for_each_entry_rcu(k, hash_head, hash_entry) {
if (k == key) {
found = true;
break;
}
}
lockdep_unregister_key():
hlist_for_each_entry_rcu(k, hash_head, hash_entry) {
if (k == key) {
hlist_del_rcu(&k->hash_entry);
found = true;
break;
}
}
smp_mb();
synchronize_lockdep_key_hazptr():
for_each_possible_cpu(cpu) {
<wait for the hazptr slot on
that CPU to be not equal to
the removed key>
}
, so that if is_dynamic_key() finds a key was in the hash, even though
later on the key would be removed by lockdep_unregister_key(), the
hazard pointers guarantee lockdep_unregister_key() would wait for the
hazard pointer to release.
that is passed to lockdep_unregister_key(). In is_dynamic_key():It is the dead_key.
hlist_for_each_entry_rcu(k, hash_head, hash_entry) {
if (k == key) {
found = true;
break;
}
}
key != k (dead_key), but before accessing its content to get to hash_entry,
an interrupt/NMI can happen. In the mean time, the structure holding the keyNo, hash_entry cannot be freed or overwritten because the user has
is freed and its content can be overwritten with some garbage. When
interrupt/NMI returns, hash_entry can point to anything leading to crash or
an infinite loop. Perhaps we can use some kind of synchronization mechanism
protect the key with hazard pointers, only when the user reset the
hazard pointer to NULL, lockdep_unregister_key() can then return and the
key can be freed.
between is_dynamic_key() and lockdep_unregister_key() to prevent this kindThe hazard pointer I proposed provides the exact synchronization ;-)
of racing. For example, we can have an atomic counter associated with each
while lockdep_unregister_key() is trying to take out an oldkey (newkey !=
oldkey). If they happens in the same hash list, register_lock_class() will
put newkey into the hazard pointer, but synchronize_lockdep_key_hazptr()
call from lockdep_unregister_key() is checking for oldkey which is not the
one saved in the hazard pointer. So lockdep_unregister_key() will return and
the key will be freed and reused while is_dynamic_key() may just have a
reference to the oldkey and trying to access its content which is invalid. I
think this is a possible scenario.
race unfortunately. (Well, technically we can still use an extra slot to
hold the key in the hash list iteration, but that would generates a lot
of stores, so it won't be ideal). But...
[...]
Your idea inspires another solution with hazard pointers, we canhead of the hashed table. is_dynamic_key() can increment the counter if it
is not zero to proceed and lockdep_unregister_key() have to make sure it can
safely decrement the counter to 0 before going ahead. Just a thought!
put the pointer of the hash_head into the hazard pointer slot ;-)
in register_lock_class():
/* hazptr: protect the key */
WRITE_ONCE(*key_hazptr, keyhashentry(lock->key));
/* Synchronizes with the smp_mb() in synchronize_lockdep_key_hazptr() */
smp_mb();
if (!static_obj(lock->key) && !is_dynamic_key(lock->key)) {
return NULL;
}
in lockdep_unregister_key():
/* Wait until register_lock_class() has finished accessing k->hash_entry. */
synchronize_lockdep_key_hazptr(keyhashentry(key));
which is more space efficient than per hash list atomic or lock unless
you have 1024 or more CPUs.