[RFC tip/locking/lockdep v6 11/20] lockdep: Fix recursive read lock related safe->unsafe detection
From: Boqun Feng
Date: Wed Apr 11 2018 - 09:48:01 EST
irq-safe -> irq-unsafe lock dependency is illegal, because it could
introduce the "irq inversion" problem, that is when an irq-unsafe lock
is held, someone else interrupts and tries to acquire an irq-safe lock.
And that case adds a temporary dependency from irq-unsafe to irq-safe,
as a result, deadlock.
There are four cases for irq inversion deadlocks:
(-(X..Y)-> means a strong dependency path starts with a --(X*)-->
dependency and ends with a --(*Y)--> dependency.)
1. An irq-safe lock L1 has a dependency -> .. -> to an
irq-unsafe lock L2.
2. An irq-read-safe lock L1 has a dependency -(N*)-> .. -> to an
irq-unsafe lock L2.
3. An irq-safe lock L1 has a dependency -> .. -(*N)-> to an
irq-read-unsafe lock L2.
4. An irq-read-safe lock L1 has a dependency -(N*)-> .. -(*N)-> to
an irq-read-unsafe lock L2.
The current check_usage() only checks 1) and 2), with the enhanced
__bfs(), we could combine all the four in find_usage_{back,for}wards(),
by using another match function. The idea is, if we are in dependency
path L1 -> .. -> L2, and the temporary irq inversion dependency for
unsafe to safe is L2 -> L1. We need a strong dependency path/circle
L1 -> .. -> L2 -> L1 to prove it's a deadlock. So we need either that L2
-> L1 is *N or that L1 -> .. -> L2 is N*, which means either L1 is an
irq-(write-)safe lock or the backwards BFS search finds that ->only_xr is
false for L1. Likewise for L2. usage_match() is updated for exactly
this logic.
Moveover, with the new find_usage_{back,for}wards(), mark_lock_irq() can
be simplified and since self inversion case is already covered, we can
further kill valid_state().
Another thing is that since we now find the inversion in one search, and
__bfs() only reports whether a match is found or not, we then lack
the information for print_bad_irq_dependency(), so we introduce a new
function match_bit(), which returns the matched usage bit in __bfs(),
and it must be called after we find a match.
Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
---
Peter,
I went further than our discussion:
https://marc.info/?l=linux-kernel&m=151938583827331
, I found we actually only need one backwards BFS and one forwards
BFS for check_irq_usage(), there is no need for two check_usage()s
there, so I open-coded check_usage() into check_irq_usage() and cleaned
other things.
Even though I added quite a few lines of comments, the diff actually
shows we save a few lines of code.
kernel/locking/lockdep.c | 299 ++++++++++++++++++++++++++---------------------
1 file changed, 163 insertions(+), 136 deletions(-)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 6135d1836ed3..18eb76189727 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -461,6 +461,26 @@ static const char *usage_str[] =
[LOCK_USED] = "INITIAL USE",
};
+static const char * const state_names[] = {
+#define LOCKDEP_STATE(__STATE) \
+ __stringify(__STATE),
+#include "lockdep_states.h"
+#undef LOCKDEP_STATE
+};
+
+static const char * const state_rnames[] = {
+#define LOCKDEP_STATE(__STATE) \
+ __stringify(__STATE)"-RR",
+#include "lockdep_states.h"
+#undef LOCKDEP_STATE
+};
+
+static inline const char *state_name(enum lock_usage_bit bit)
+{
+ return (bit & 1) ? state_rnames[bit >> 2] : state_names[bit >> 2];
+}
+
+
const char * __get_key_name(struct lockdep_subclass_key *key, char *str)
{
return kallsyms_lookup((unsigned long)key, NULL, NULL, NULL, str);
@@ -1542,11 +1562,44 @@ check_redundant(struct lock_list *root, struct held_lock *target,
* Forwards and backwards subgraph searching, for the purposes of
* proving that two subgraphs can be connected by a new dependency
* without creating any illegal irq-safe -> irq-unsafe lock dependency.
+ *
+ * irq-safe -> irq-unsafe lock dependency is illegal, because it could happen
+ * that when irq-unsafe lock L2 is held, an interrupt happens and the
+ * corresponding handler tries to acquire irq-safe lock L1, and that creates
+ * a temporary dependency L2 -> L1, and since we already find a dependency from
+ * L1 to L2, which means we have a circle L2 -> L1 -> .. -> L2. But note that
+ * the circle has to be a strong path for a deadlock, so we need to rule out
+ * cases where 1) L1 -> .. -> L2 is R* and L1 is only an irq-read-safe lock and
+ * 2) L1 -> .. -> L2 is *R and L2 is only an irq-read-unsafe lock.
*/
+/*
+ * The match function for usage checks.
+ *
+ * As above, if in the BFS, entry->only_xr is true (means L1 -> .. is R* in
+ * backwards searching or .. -> L2 is *R in forwards searching), then read
+ * usage of @entry doesn't count as strong. So we only test read usage if
+ * entry->only_xr is false.
+ *
+ * In other words, if we find a write usage (either irq-safe or irq-unsafe), we
+ * don't need to care about the type of the path (i.e. no need to check
+ * ->only_xr). Otherwise, ->only_xr has to be false, and we also need a read
+ * usage.
+ */
static inline bool usage_match(struct lock_list *entry, void *bit)
{
- return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
+ enum lock_usage_bit ub = (enum lock_usage_bit)bit;
+ unsigned long mask;
+ unsigned long read_mask;
+ unsigned long usage;
+
+ ub &= ~1; /* remove the read usage bit */
+ mask = 1UL << ub;
+ read_mask = 1UL << (ub + 1);
+ usage = entry->class->usage_mask;
+
+ return (usage & mask) || /* write usage works with any path */
+ (!entry->only_xr && (usage & read_mask)); /* read usage only works with *N path */
}
@@ -1708,8 +1761,7 @@ print_bad_irq_dependency(struct task_struct *curr,
struct held_lock *prev,
struct held_lock *next,
enum lock_usage_bit bit1,
- enum lock_usage_bit bit2,
- const char *irqclass)
+ enum lock_usage_bit bit2)
{
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
return 0;
@@ -1717,7 +1769,7 @@ print_bad_irq_dependency(struct task_struct *curr,
pr_warn("\n");
pr_warn("=====================================================\n");
pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
- irqclass, irqclass);
+ state_name(bit1), state_name(bit2));
print_kernel_ident();
pr_warn("-----------------------------------------------------\n");
pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
@@ -1737,15 +1789,15 @@ print_bad_irq_dependency(struct task_struct *curr,
pr_cont("\n");
pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
- irqclass);
+ state_name(bit1));
print_lock_name(backwards_entry->class);
- pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+ pr_warn("\n... which became %s-irq-safe at:\n", state_name(bit1));
print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
- pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
+ pr_warn("\nto a %s-irq-unsafe lock:\n", state_name(bit2));
print_lock_name(forwards_entry->class);
- pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
+ pr_warn("\n... which became %s-irq-unsafe at:\n", state_name(bit2));
pr_warn("...");
print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
@@ -1756,13 +1808,13 @@ print_bad_irq_dependency(struct task_struct *curr,
lockdep_print_held_locks(curr);
- pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
+ pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", state_name(bit1));
if (!save_trace(&prev_root->trace))
return 0;
print_shortest_lock_dependencies(backwards_entry, prev_root);
pr_warn("\nthe dependencies between the lock to be acquired");
- pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
+ pr_warn(" and %s-irq-unsafe lock:\n", state_name(bit2));
if (!save_trace(&next_root->trace))
return 0;
print_shortest_lock_dependencies(forwards_entry, next_root);
@@ -1773,55 +1825,6 @@ print_bad_irq_dependency(struct task_struct *curr,
return 0;
}
-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit_backwards,
- enum lock_usage_bit bit_forwards, const char *irqclass)
-{
- int ret;
- struct lock_list this, that;
- struct lock_list *uninitialized_var(target_entry);
- struct lock_list *uninitialized_var(target_entry1);
-
- bfs_init_root(&this, prev);
- ret = find_usage_backwards(&this, bit_backwards, &target_entry);
- if (bfs_error(ret))
- return print_bfs_bug(ret);
- if (ret == BFS_RNOMATCH)
- return 1;
-
- bfs_init_root(&that, next);
- ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
- if (bfs_error(ret))
- return print_bfs_bug(ret);
- if (ret == BFS_RNOMATCH)
- return 1;
-
- return print_bad_irq_dependency(curr, &this, &that,
- target_entry, target_entry1,
- prev, next,
- bit_backwards, bit_forwards, irqclass);
-}
-
-static const char *state_names[] = {
-#define LOCKDEP_STATE(__STATE) \
- __stringify(__STATE),
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
-};
-
-static const char *state_rnames[] = {
-#define LOCKDEP_STATE(__STATE) \
- __stringify(__STATE)"-RR",
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
-};
-
-static inline const char *state_name(enum lock_usage_bit bit)
-{
- return (bit & 1) ? state_rnames[bit >> 2] : state_names[bit >> 2];
-}
-
static int exclusive_bit(int new_bit)
{
/*
@@ -1844,32 +1847,66 @@ static int exclusive_bit(int new_bit)
return state | (dir ^ 2);
}
+/*
+ * We found a lock L whose class is @entry->class and L has a usage matching
+ * @dir_bit with usage_match() in a BFS search. And we need to figure out which
+ * exact usage bit is matched by usage_match().
+ *
+ * This function must be called after find_usage_{for,back}wards() and we find
+ * a match.
+ *
+ * Since we already find a match, if the lock doesn't have write usage, that
+ * means the usage we matched was read usage, otherwise it was write usage
+ * (because we check write usage in usage_match() first).
+ */
+static enum lock_usage_bit match_bit(struct lock_list *entry,
+ enum lock_usage_bit dir_bit)
+{
+ unsigned long mask = 1UL << dir_bit;
+ unsigned long usage = entry->class->usage_mask;
+
+ return dir_bit + !(usage & mask);
+}
+
static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit)
+ struct held_lock *next, enum lock_usage_bit dir_bit)
{
- /*
- * Prove that the new dependency does not connect a hardirq-safe
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
- */
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ int ret;
+ struct lock_list this, that;
+ struct lock_list *uninitialized_var(target_entry);
+ struct lock_list *uninitialized_var(target_entry1);
+ enum lock_usage_bit excl_bit;
- bit++; /* _READ */
+ /* the read/write bit should be zero */
+ if (WARN_ON_ONCE(dir_bit & 1))
+ dir_bit &= ~1;
+
+ excl_bit = exclusive_bit(dir_bit);
/*
- * Prove that the new dependency does not connect a hardirq-safe-read
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Prove that the new dependency does not connect a irq-safe lock with
+ * a irq-unsafe lock - to achieve this we search the backwards-subgraph
+ * starting at <prev>, and the forwards-subgraph starting at <next>:
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ bfs_init_root(&this, prev);
+ ret = find_usage_backwards(&this, dir_bit, &target_entry);
+ if (bfs_error(ret))
+ return print_bfs_bug(ret);
+ if (ret == BFS_RNOMATCH)
+ return 1;
- return 1;
+ bfs_init_root(&that, next);
+ ret = find_usage_forwards(&that, excl_bit, &target_entry1);
+ if (bfs_error(ret))
+ return print_bfs_bug(ret);
+ if (ret == BFS_RNOMATCH)
+ return 1;
+
+ return print_bad_irq_dependency(curr, &this, &that,
+ target_entry, target_entry1,
+ prev, next,
+ match_bit(target_entry, dir_bit),
+ match_bit(target_entry1, excl_bit));
}
static int
@@ -2782,18 +2819,6 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
return 0;
}
-/*
- * Print out an error if an invalid bit is set:
- */
-static inline int
-valid_state(struct task_struct *curr, struct held_lock *this,
- enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
-{
- if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit)))
- return print_usage_bug(curr, this, bad_bit, new_bit);
- return 1;
-}
-
static int mark_lock(struct task_struct *curr, struct held_lock *this,
enum lock_usage_bit new_bit);
@@ -2863,27 +2888,48 @@ print_irq_inversion_bug(struct task_struct *curr,
return 0;
}
+/*
+ * Forwards and backwards subgraph searching, for the purposes of
+ * proving that a dependency path can become an illegal irq-safe ->
+ * irq-unsafe lock dependency because of the newly usage found.
+ *
+ * A special case other than what we describe in comments before usage_match()
+ * is, a lock L adds usage USED_IN while it already has ENABLED usage or vice
+ * versa, unless a lock only adds USED_IN_READ or ENABLED_READ usage to each
+ * other, which is not a deadlock.
+ *
+ * In the special case, find_usage_forwards() and find_usage_backwards() still
+ * work, because __bfs() will first try the root node. We just need to split
+ * this out for a different debug report (self irq inversion).
+ */
+
/*
* Prove that in the forwards-direction subgraph starting at <this>
* there is no lock matching <mask>:
*/
static int
check_usage_forwards(struct task_struct *curr, struct held_lock *this,
- enum lock_usage_bit bit, const char *irqclass)
+ enum lock_usage_bit bit, enum lock_usage_bit excl_bit)
{
enum bfs_result ret;
struct lock_list root;
struct lock_list *uninitialized_var(target_entry);
bfs_init_root(&root, this);
- ret = find_usage_forwards(&root, bit, &target_entry);
+ ret = find_usage_forwards(&root, excl_bit, &target_entry);
if (bfs_error(ret))
return print_bfs_bug(ret);
if (ret == BFS_RNOMATCH)
return ret;
- return print_irq_inversion_bug(curr, &root, target_entry,
- this, 1, irqclass);
+ /* self inversion */
+ if (target_entry->class == hlock_class(this))
+ return print_usage_bug(curr, this,
+ match_bit(target_entry, excl_bit),
+ bit);
+ else
+ return print_irq_inversion_bug(curr, &root, target_entry,
+ this, 1, state_name(bit));
}
/*
@@ -2892,21 +2938,27 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,
*/
static int
check_usage_backwards(struct task_struct *curr, struct held_lock *this,
- enum lock_usage_bit bit, const char *irqclass)
+ enum lock_usage_bit bit, enum lock_usage_bit excl_bit)
{
enum bfs_result ret;
struct lock_list root;
struct lock_list *uninitialized_var(target_entry);
bfs_init_root(&root, this);
- ret = find_usage_backwards(&root, bit, &target_entry);
+ ret = find_usage_backwards(&root, excl_bit, &target_entry);
if (bfs_error(ret))
return print_bfs_bug(ret);
if (ret == BFS_RNOMATCH)
return 1;
- return print_irq_inversion_bug(curr, &root, target_entry,
- this, 0, irqclass);
+ /* self inversion */
+ if (target_entry->class == hlock_class(this))
+ return print_usage_bug(curr, this,
+ match_bit(target_entry, excl_bit),
+ bit);
+ else
+ return print_irq_inversion_bug(curr, &root, target_entry,
+ this, 0, state_name(bit));
}
void print_irqtrace_events(struct task_struct *curr)
@@ -2942,8 +2994,6 @@ static int SOFTIRQ_verbose(struct lock_class *class)
return 0;
}
-#define STRICT_READ_CHECKS 1
-
static int (*state_verbose_f[])(struct lock_class *class) = {
#define LOCKDEP_STATE(__STATE) \
__STATE##_verbose,
@@ -2965,44 +3015,21 @@ mark_lock_irq(struct task_struct *curr, struct held_lock *this,
enum lock_usage_bit new_bit)
{
int excl_bit = exclusive_bit(new_bit);
- int read = new_bit & 1;
- int dir = new_bit & 2;
-
- /*
- * mark USED_IN has to look forwards -- to ensure no dependency
- * has ENABLED state, which would allow recursion deadlocks.
- *
- * mark ENABLED has to look backwards -- to ensure no dependee
- * has USED_IN state, which, again, would allow recursion deadlocks.
- */
- check_usage_f usage = dir ?
- check_usage_backwards : check_usage_forwards;
- /*
- * Validate that this particular lock does not have conflicting
- * usage states.
- */
- if (!valid_state(curr, this, new_bit, excl_bit))
- return 0;
-
- /*
- * Validate that the lock dependencies don't have conflicting usage
- * states.
- */
- if ((!read || !dir || STRICT_READ_CHECKS) &&
- !usage(curr, this, excl_bit, state_name(new_bit & ~1)))
- return 0;
-
- /*
- * Check for read in write conflicts
- */
- if (!read) {
- if (!valid_state(curr, this, new_bit, excl_bit + 1))
+ if (new_bit & 2) {
+ /*
+ * mark ENABLED has to look backwards -- to ensure no dependee
+ * has USED_IN state, which, again, would allow recursion
+ * deadlocks.
+ */
+ if (!check_usage_backwards(curr, this, new_bit, excl_bit))
return 0;
-
- if (STRICT_READ_CHECKS &&
- !usage(curr, this, excl_bit + 1,
- state_name(new_bit + 1)))
+ } else {
+ /*
+ * mark USED_IN has to look forwards -- to ensure no dependency
+ * has ENABLED state, which would allow recursion deadlocks.
+ */
+ if (!check_usage_forwards(curr, this, new_bit, excl_bit))
return 0;
}
--
2.16.2