[RFC v2 13/13] lockdep: Add a document describing crossrelease feature

From: Byungchul Park
Date: Thu Jul 07 2016 - 05:36:42 EST


Crossrelease feature introduces new concept and data structure. Thus
the document is necessary. So added it.

Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
---
Documentation/locking/crossrelease.txt | 457 +++++++++++++++++++++++++++++++++
1 file changed, 457 insertions(+)
create mode 100644 Documentation/locking/crossrelease.txt

diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt
new file mode 100644
index 0000000..51b583b
--- /dev/null
+++ b/Documentation/locking/crossrelease.txt
@@ -0,0 +1,457 @@
+Crossrelease dependency check
+=============================
+
+Started by Byungchul Park <byungchul.park@xxxxxxx>
+
+Contents:
+
+ (*) Introduction.
+
+ - What the lockdep checks.
+
+ (*) What is a problem?
+
+ - Examples.
+ - Lockdep's assumption.
+ - Lockdep's limitation.
+
+ (*) How to solve the problem.
+
+ - What causes deadlock?
+ - Relax the assumption.
+ - Relax the limitation.
+
+ (*) Implementation.
+
+ - Introduce crosslock.
+ - Introduce commit stage.
+ - Acquire vs commit vs release.
+ - Data structures.
+
+ (*) Optimizations.
+
+
+============
+Introduction
+============
+
+What the lockdep checks
+-----------------------
+
+Lockdep checks dependency between locks and reports it if a deadlock
+possibility is detected in advance or if an actual deadlock occures at
+the time checking it.
+
+The former is more valuable because the possibility detection without
+lockdep is much harder. When an actual deadlock occures, we can identify
+what happens in the system by some means or other, even without lockdep.
+
+It checks dependency between locks (exactly lock classes, see
+Documentation/locking/lockdep-design.txt for more) and builds the
+relationship between them when the lock is acquired. Eventually it forms
+the global dependency graph which connects between related locks based
+on dependency they have, like e.g.
+
+A - B - - F - G
+ \ /
+ - E - - L
+ / \ /
+C - D - - H -
+ \
+ - I - K
+ /
+ J -
+
+where A, B, C,..., L are different lock classes.
+
+Lockdep basically works based on this global dependency graph. The more
+nodes it has, the stronger check can be performed.
+
+For example, lockdep can detect a deadlock when acquiring a C class lock
+while it already held a K class lock, because it forms a cycle which
+means deadlock, like
+
+A - B - - F - G
+ \ /
+ - E - - L
+ / \ /
+C - D - - H -
+| \
+| - I - K
+| / |
+| J - |
++-------------------------+
+
+where A, B, C,..., L are different lock classes.
+
+If any of nodes making up the cycle was not built into the graph, we
+cannot detect this kind of deadlock because there's no cycle. For
+example, it might be
+
+A - B - - F - G
+ \ /
+ - E - L
+ /
+C - D -
+|
+| - I - K
+| / |
+| J - |
++-------------------------+
+
+where A, B, C,..., L are different lock classes.
+
+Adding nodes and edges gives us additional opportunity to check if it
+causes deadlock or not, so makes the detection stronger.
+
+
+=================
+What is a problem
+=================
+
+Examples
+--------
+
+Can we detect deadlocks descriped below, with lockdep? No.
+
+Example 1)
+
+ PROCESS X PROCESS Y
+ -------------- --------------
+ mutext_lock A
+ lock_page B
+ lock_page B
+ mutext_lock A // DEADLOCK
+ unlock_page B
+ mutext_unlock A
+ mutex_unlock A
+ unlock_page B
+
+We are not checking the dependency for lock_page() at all now.
+
+Example 2)
+
+ PROCESS X PROCESS Y PROCESS Z
+ -------------- -------------- --------------
+ mutex_lock A
+ lock_page B
+ lock_page B
+ mutext_lock A // DEADLOCK
+ mutext_unlock A
+ unlock_page B
+ (B was held by PROCESS X)
+ unlock_page B
+ mutex_unlock A
+
+We cannot detect this kind of deadlock with lockdep, even though we
+apply the dependency check using lockdep on lock_page().
+
+Example 3)
+
+ PROCESS X PROCESS Y
+ -------------- --------------
+ mutex_lock A
+ mutex_lock A
+ mutex_unlock A
+ wait_for_complete B // DEADLOCK
+ complete B
+ mutex_unlock A
+
+wait_for_complete() and complete() also can cause a deadlock, however
+we cannot detect it with lockdep, either.
+
+
+lockdep's assumption
+--------------------
+
+Lockdep (not crossrelease featured one) assumes that,
+
+ A lock will be unlocked within the context holding the lock.
+
+Thus we can say that a lock has dependency with all locks being already
+in held_locks. So we can check dependency and build the dependency graph
+when acquiring it.
+
+
+lockdep's limitation
+--------------------
+
+It can be applied only to typical lock operations, e.g. spin_lock,
+mutex and so on. However, e.g. lock_page() cannot play with the lockdep
+thingy because it violates assumptions of lockdep, even though it's
+considered as a lock for the page access.
+
+In the view point of lockdep, it must be released within the context
+which held the lock, however, the page lock can be released by different
+context from the context which held it. Another example,
+wait_for_complete() and complete() are also the case by nature, which
+the lockdep cannot deal with.
+
+
+========================
+How to solve the problem
+========================
+
+What causes deadlock
+--------------------
+
+Not only lock operations, but also any operations causing to wait or
+spin for something can cause deadlock unless it's eventually *released*
+by someone. The important point here is that the waiting or spinning
+must be *released* by someone. In other words, we have to focus whether
+the waiting or spinning can be *released* or not to check a deadlock
+possibility, rather than the waiting or spinning itself.
+
+In this point of view, typical lock is a special case where the acquire
+context is same as the release context, so no matter in which context
+the checking is performed for typical lock.
+
+Of course, in order to be able to report deadlock imediately at the time
+real deadlock actually occures, the checking must be performed before
+actual blocking or spinning happens when acquiring it. However, deadlock
+*possibility* can be detected and reported even the checking is done
+when releasing it, which means the time we can identify the release
+context.
+
+
+Relax the assumption
+--------------------
+
+Since whether waiting or spinning can be released or not is more
+important to check deadlock possibility as decribed above, we can relax
+the assumtion the lockdep has.
+
+ A lock can be unlocked in any context, unless the context
+ itself causes a deadlock e.g. acquiring a lock in irq-safe
+ context before releasing the lock in irq-unsafe context.
+
+A lock has dependency with all locks in the release context, having been
+held since the lock was held. Thus basically we can check the dependency
+only after we identify the release context at first. Of course, we can
+identify the release context at the time acquiring a lock for typical
+lock because the acquire context is same as the release context.
+
+However, generally if we cannot identify the release context at the time
+acquiring a lock, we have to wait until the lock having been held will
+be eventually released to identify the release context.
+
+
+Relax the limitation
+--------------------
+
+Given that the assumption is relaxed, we can check dependency and detect
+deadlock possibility not only for typical lock, but also for lock_page()
+using PG_locked, wait_for_xxx() and so on which might be released by
+different context from the context which held the lock.
+
+
+==============
+Implementation
+==============
+
+Introduce crosslock
+-------------------
+
+Crossrelease feature names a lock "crosslock" if it is releasable by a
+different context from the context which held the lock. All locks
+having been held in the context unlocking the crosslock, since the
+crosslock was held until eventually it's unlocked, have dependency with
+the crosslock. That's the key idea to implement crossrelease feature.
+
+
+Introduce commit stage
+----------------------
+
+Crossrelease feature names it "commit", to check dependency and build
+the dependency graph and chain in batches. Lockdep already performs what
+the commit does, when acquiring a lock. This way it works for typical
+lock, since it's guarrented that the acquire context is same as the
+release context for that. However, that way must be changed for
+crosslock so that it identify the release context when releasing the
+crosslock and then performs the commit.
+
+Let's demonstrate it though several examples.
+
+The below is how the current lockdep works for typical lock. Note that
+context 1 == context 2 for typical lock.
+
+ CONTEXT 1 CONTEXT 2
+ acquiring A releasing A
+ ------------- -------------
+ acquire A acquire A
+
+ acquire B acquire B -> build "A - B"
+
+ acquire C acquire C -> build "A - C"
+
+ release C release C
+
+ release B release B
+
+ release A release A
+
+After building "A - B", the dependency graph forms like,
+
+A - B
+
+And after building "A - C", the dependency graph forms like,
+
+ - B
+ /
+A -
+ \
+ - C
+
+What if we apply the commit to lockdep for typical lock? Of course, it's
+not necessary for typical lock. Just a example for what the commit does.
+
+ CONTEXT 1 CONTEXT 2
+ acquiring A releasing A
+ ------------- -------------
+ acquire A acquire A -> mark A started
+
+ acquire B acquire B -> queue B
+
+ acquire C acquire C -> queue C
+
+ release C release C
+
+ release B release B
+
+ release A release A -> commit A (build "A - B", "A - C")
+
+After commiting A, the dependency graph forms like, at a time
+
+ - B
+ /
+A -
+ \
+ - C
+
+Here we can see both the former and the latter end in building a same
+dependency graph consisting of "A - B" and "A - C". Of course, the
+former can build the graph earlier than the latter, which means the
+former can detect a deadlock sooner, maybe, as soon as possible. So the
+former would be prefered if possible.
+
+Let's look at the way the commit works for crosslock.
+
+ CONTEXT 1 CONTEXT 2
+ acquiring A releasing A
+ ------------- -------------
+ lock A
+ acquire A -> mark A started
+
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <- serialized by some means
+
+ acquire X -> queue X
+ lock X
+ acquire B
+ release X
+ acquire C
+ acquire Y -> queue Y
+ release C
+ release Y
+ release B
+ release A -> commit A (build "A - X", "A - Y")
+
+where A is a crosslock and the others are typical locks.
+
+Since a crosslock is held, it starts to queue all candidates whenever
+acquiring typical lock, and keep it until finally it will be released.
+Then it can commit all proper candidates queued until now. In other
+words, it checks dependency and builds the dependency graph and chain
+at the commit stage.
+
+And it is serialized so that the sequence, A -> X -> Y can be seen,
+using atomic operations and proper barriers.
+
+
+Acquire vs commit vs release
+----------------------------
+
+What the lockdep should do in each stage to make it work even for
+crosslock is like, (Note that it does not change any current logic
+by which lockdep works, but only adds additional detection capability.)
+
+1. Acquire
+
+ 1) For typical lock
+
+ It's queued into additional data structure so that the
+ commit stage can check depedndency and build the
+ dependency graph and chain with this later.
+
+ 2) For crosslock
+
+ It's added to the global linked list so that the commit
+ stage can check dependency and build the dependency
+ graph and chain with this later.
+
+2. Commit
+
+ 1) For typical lock
+
+ N/A.
+
+ 2) For crosslock
+
+ It checks dependency and builds the dependency graph and
+ chain with data saved in the acquire stage. Here, we
+ establish dependency between the crosslock we are
+ unlocking now and all locks in that context, having been
+ held since the lock was held. Of course, it avoids
+ unnecessary checking and building as far as possible.
+
+3. Release
+
+ 1) For typical lock
+
+ No change.
+
+ 2) For crosslock
+
+ Just remove this crosslock from the global linked list,
+ to which the crosslock was added at acquire stage.
+ Release operation should be used with commit operation
+ together for crosslock.
+
+
+Data structures
+---------------
+
+Crossrelease feature introduces two new data structures.
+
+1. pend_lock (or plock)
+
+ This is for keeping locks waiting to be committed so that the
+ actual dependency graph and chain can be built in the commit
+ stage. Every task_struct has a pend_lock array to keep these
+ locks. pend_lock entry will be consumed and filled whenever
+ lock_acquire() is called for typical lock and will be flushed,
+ namely committed at proper time. And this array is managed by
+ circular buffer mean.
+
+2. cross_lock (or xlock)
+
+ This keeps some additional data only for crosslock. One
+ instance exists per one crosslock's lockdep_map.
+ lockdep_init_map_crosslock() should be used instead of
+ lockdep_init_map() to use a lock as a crosslock.
+
+
+=============
+Optimizations
+=============
+
+Adding a pend_lock is an operation which very frequently happened
+because it happens whenever a typical lock is acquired. So the operation
+is implemented locklessly using rcu mechanism if possible. Unfortunitly,
+we cannot apply this optimization if any object managed by rcu e.g.
+xlock is on stack or somewhere else where it can be freed or destroyed
+unpredictably.
+
+And chain cache for crosslock is also used to avoid unnecessary checking
+and building dependency, like how the lockdep is already doing for that
+purpose for typical lock.
+
--
1.9.1