Re: [PATCH RFC] lockdep: Update documentation for lock-class leak detection

From: Peter Zijlstra
Date: Thu Sep 29 2011 - 09:31:14 EST


On Wed, 2011-09-28 at 11:11 -0700, Paul E. McKenney wrote:
> There are a number of bugs that can leak lock classes, which will
> eventually exhaust the maximum number (currently 8191). However,
> the documentation does not tell you how to track down the leakers.
> This commit addresses this shortcoming.
>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
> index abf768c..24bfd9f 100644
> --- a/Documentation/lockdep-design.txt
> +++ b/Documentation/lockdep-design.txt
> @@ -221,3 +221,55 @@ when the chain is validated for the first time, is then put into a hash
> table, which hash-table can be checked in a lockfree manner. If the
> locking chain occurs again later on, the hash table tells us that we
> dont have to validate the chain again.
> +
> +Troubleshooting:
> +----------------
> +
> +The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
> +Exceeding this number will trigger the following lockdep warning:
> +
> + (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
> +
> +By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
> +desktop systems have less than 1,000 lock classes, so this warning
> +normally results from lock-class leakage. Such leakage can result
> +from the following:
> +
> +1. Repeated module loading and unloading while running the validator.
> + The issue here is that each load of the module will create a
> + new set of lock classes for that module's locks, and module
> + unloading cannot remove old classes. Therefore, if that module
> + is loaded and unloaded repeatedly, the number of lock classes
> + will eventually reach the maximum.
> +
> +2. Dynamically allocating and freeing structures containing fields
> + of type "struct lock_class_key". Again, the fact that old
> + lock classes cannot be reused means that repeating allocation/free
> + cycles for long enough will cause the number of lock classes to
> + eventually reach the maximum.
> +

This isn't actually true; we check that keys live in .data or .bss:

register_lock_class():
	/*
	 * Debug-check: all keys must be persistent!
	 */
	if (!static_obj(lock->key)) {
		debug_locks_off();
		printk("INFO: trying to register non-static key.\n");
		printk("the code is fine but needs lockdep annotation.\n");
		printk("turning off the locking correctness validator.\n");
		dump_stack();

		return NULL;
	}


But what can happen is that you 'accidentally' create a lot of static
locks, e.g.:

	struct {
		spinlock_t lock;
		struct hlist_head hlist;
	} my_hash[1 << HASH_BITS];

If you don't initialize the lock members you'll find that each will get
a separate lock class based on its static address. This can quickly
deplete the class storage.

Now really, you should never leave a lock uninitialized, but the above
has actually happened, although I can't find the commit at the moment.
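
To make the difference concrete, here is a rough userspace sketch of the
keying rule, not the real lockdep code. Every toy_* name, MAX_TOY_CLASSES
and the bad_hash/good_hash arrays are made up for illustration. It assumes
two things about lockdep: a static lock with no key gets keyed by its own
address, and an init macro like spin_lock_init() supplies one static key
per call site.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_BITS	4
#define NR_BUCKETS	(1 << HASH_BITS)
#define MAX_TOY_CLASSES	64	/* hypothetical cap, stands in for MAX_LOCKDEP_KEYS */

/* Toy stand-ins for the kernel types; NOT the real lockdep structures. */
struct toy_key { char dummy; };

struct toy_lock {
	const void *key;	/* NULL until the lock is initialized */
};

/* One static key per expansion site, modelled on spin_lock_init(). */
#define toy_lock_init(l)				\
	do {						\
		static struct toy_key __key;		\
		(l)->key = &__key;			\
	} while (0)

static const void *toy_classes[MAX_TOY_CLASSES];
static size_t toy_nr_classes;

/*
 * Register the lock's class on first use. A lock without a key is keyed
 * by its own (static) address, so each such lock becomes its own class.
 */
static void toy_acquire(struct toy_lock *l)
{
	const void *key = l->key ? l->key : (const void *)l;
	size_t i;

	for (i = 0; i < toy_nr_classes; i++)
		if (toy_classes[i] == key)
			return;		/* class already known */

	assert(toy_nr_classes < MAX_TOY_CLASSES);
	toy_classes[toy_nr_classes++] = key;
}

static struct { struct toy_lock lock; } bad_hash[NR_BUCKETS];
static struct { struct toy_lock lock; } good_hash[NR_BUCKETS];

/* Locks never initialized: returns the number of classes they create. */
size_t toy_count_uninitialized(void)
{
	size_t before = toy_nr_classes, i;

	for (i = 0; i < NR_BUCKETS; i++)
		toy_acquire(&bad_hash[i].lock);
	return toy_nr_classes - before;
}

/* Locks initialized in one loop: a single call site, so a single key. */
size_t toy_count_initialized(void)
{
	size_t before = toy_nr_classes, i;

	for (i = 0; i < NR_BUCKETS; i++)
		toy_lock_init(&good_hash[i].lock);
	for (i = 0; i < NR_BUCKETS; i++)
		toy_acquire(&good_hash[i].lock);
	return toy_nr_classes - before;
}
```

In this model the uninitialized table costs one class per bucket, while
the table initialized in a loop costs one class in total. In the real
kernel the analogous fix would be a loop calling spin_lock_init() on each
my_hash[i].lock before first use.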

