Re: [PATCH v6 00/16] locking/lockdep: Add support for dynamic keys

From: Bart Van Assche
Date: Fri Jan 11 2019 - 12:01:47 EST


On Fri, 2019-01-11 at 17:55 +-0100, Peter Zijlstra wrote:
+AD4 On Fri, Jan 11, 2019 at 07:55:03AM -0800, Bart Van Assche wrote:
+AD4 +AD4 On Fri, 2019-01-11 at 13:48 +-0100, Peter Zijlstra wrote:
+AD4 +AD4 +AD4 I spotted this new v6 in my inbox and have rebased to it.
+AD4 +AD4
+AD4 +AD4 Thanks+ACE
+AD4 +AD4
+AD4 +AD4 +AD4 On Wed, Jan 09, 2019 at 01:01:48PM -0800, Bart Van Assche wrote:
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 +AD4 The changes compared to v5 are:
+AD4 +AD4 +AD4 +AD4 - Modified zap+AF8-class() such that it doesn't try to free a list entry that
+AD4 +AD4 +AD4 +AD4 is already being freed.
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 I however have a question on this+ADs this seems wrong. Once a list entry
+AD4 +AD4 +AD4 is enqueued it should not be reachable anymore. If we can reach an entry
+AD4 +AD4 +AD4 after call+AF8-rcu() happened, we've got a problem.
+AD4 +AD4
+AD4 +AD4 Apparently I confused you - sorry that I was not more clear. What I meant is
+AD4 +AD4 that I changed a single if test into a loop. The graph lock is held while that
+AD4 +AD4 loop is being executed so the code below is serialized against the code called
+AD4 +AD4 from inside the RCU callback:
+AD4 +AD4
+AD4 +AD4 +AEAAQA -4574,8 +-4563,9 +AEAAQA static void zap+AF8-class(struct pending+AF8-free +ACo-pf, struct lock
+AD4 +AD4 +AF8-class +ACo-class)
+AD4 +AD4 entry +AD0 list+AF8-entries +- i+ADs
+AD4 +AD4 if (entry-+AD4-class +ACEAPQ class +ACYAJg entry-+AD4-links+AF8-to +ACEAPQ class)
+AD4 +AD4 continue+ADs
+AD4 +AD4 - if (+AF8AXw-test+AF8-and+AF8-set+AF8-bit(i, pf-+AD4-list+AF8-entries+AF8-being+AF8-freed))
+AD4 +AD4 +- if (list+AF8-entry+AF8-being+AF8-freed(i))
+AD4 +AD4 continue+ADs
+AD4
+AD4 Yes, it is the above change that caught my eye.. That checks +AF8-both+AF8 your
+AD4 lists. One is your current open one (+AEA-pf), but the other could already
+AD4 be pending the call+AF8-rcu().
+AD4
+AD4 So my question is why do we have to check both ?+ACE How come the old code,
+AD4 that only checked +AEA-pf, is wrong?
+AD4
+AD4 +AD4 +- set+AF8-bit(i, pf-+AD4-list+AF8-entries+AF8-being+AF8-freed)+ADs
+AD4 +AD4 nr+AF8-list+AF8-entries--+ADs
+AD4 +AD4 list+AF8-del+AF8-rcu(+ACY-entry-+AD4-entry)+ADs
+AD4 +AD4 +AH0

The list+AF8-del+AF8-rcu() call must only happen once. I ran into complaints reporting that
the list+AF8-del+AF8-rcu() call triggered list corruption. This change made these complaints
disappear.

Bart.