Re: [RFC v2 13/13] lockdep: Add a document describing crossrelease feature

From: Byungchul Park
Date: Sun Jul 17 2016 - 21:41:38 EST


On Thu, Jul 07, 2016 at 06:30:03PM +0900, Byungchul Park wrote:
> Crossrelease feature introduces new concept and data structure. Thus
> the document is necessary. So added it.

Any opinions about this suggestion?

>
> Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
> ---
> Documentation/locking/crossrelease.txt | 457 +++++++++++++++++++++++++++++++++
> 1 file changed, 457 insertions(+)
> create mode 100644 Documentation/locking/crossrelease.txt
>
> diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt
> new file mode 100644
> index 0000000..51b583b
> --- /dev/null
> +++ b/Documentation/locking/crossrelease.txt
> @@ -0,0 +1,457 @@
> +Crossrelease dependency check
> +=============================
> +
> +Started by Byungchul Park <byungchul.park@xxxxxxx>
> +
> +Contents:
> +
> + (*) Introduction.
> +
> + - What the lockdep checks.
> +
> + (*) What is a problem?
> +
> + - Examples.
> + - Lockdep's assumption.
> + - Lockdep's limitation.
> +
> + (*) How to solve the problem.
> +
> + - What causes deadlock?
> + - Relax the assumption.
> + - Relax the limitation.
> +
> + (*) Implementation.
> +
> + - Introduce crosslock.
> + - Introduce commit stage.
> + - Acquire vs commit vs release.
> + - Data structures.
> +
> + (*) Optimizations.
> +
> +
> +============
> +Introduction
> +============
> +
> +What the lockdep checks
> +-----------------------
> +
> +Lockdep checks dependency between locks and reports it if a deadlock
> +possibility is detected in advance or if an actual deadlock occures at
> +the time checking it.
> +
> +The former is more valuable because the possibility detection without
> +lockdep is much harder. When an actual deadlock occures, we can identify
> +what happens in the system by some means or other, even without lockdep.
> +
> +It checks dependency between locks (exactly lock classes, see
> +Documentation/locking/lockdep-design.txt for more) and builds the
> +relationship between them when the lock is acquired. Eventually it forms
> +the global dependency graph which connects between related locks based
> +on dependency they have, like e.g.
> +
> +A - B - - F - G
> + \ /
> + - E - - L
> + / \ /
> +C - D - - H -
> + \
> + - I - K
> + /
> + J -
> +
> +where A, B, C,..., L are different lock classes.
> +
> +Lockdep basically works based on this global dependency graph. The more
> +nodes it has, the stronger check can be performed.
> +
> +For example, lockdep can detect a deadlock when acquiring a C class lock
> +while it already held a K class lock, because it forms a cycle which
> +means deadlock, like
> +
> +A - B - - F - G
> + \ /
> + - E - - L
> + / \ /
> +C - D - - H -
> +| \
> +| - I - K
> +| / |
> +| J - |
> ++-------------------------+
> +
> +where A, B, C,..., L are different lock classes.
> +
> +If any of nodes making up the cycle was not built into the graph, we
> +cannot detect this kind of deadlock because there's no cycle. For
> +example, it might be
> +
> +A - B - - F - G
> + \ /
> + - E - L
> + /
> +C - D -
> +|
> +| - I - K
> +| / |
> +| J - |
> ++-------------------------+
> +
> +where A, B, C,..., L are different lock classes.
> +
> +Adding nodes and edges gives us additional opportunity to check if it
> +causes deadlock or not, so makes the detection stronger.
> +
> +
> +=================
> +What is a problem
> +=================
> +
> +Examples
> +--------
> +
> +Can we detect deadlocks descriped below, with lockdep? No.
> +
> +Example 1)
> +
> + PROCESS X PROCESS Y
> + -------------- --------------
> + mutext_lock A
> + lock_page B
> + lock_page B
> + mutext_lock A // DEADLOCK
> + unlock_page B
> + mutext_unlock A
> + mutex_unlock A
> + unlock_page B
> +
> +We are not checking the dependency for lock_page() at all now.
> +
> +Example 2)
> +
> + PROCESS X PROCESS Y PROCESS Z
> + -------------- -------------- --------------
> + mutex_lock A
> + lock_page B
> + lock_page B
> + mutext_lock A // DEADLOCK
> + mutext_unlock A
> + unlock_page B
> + (B was held by PROCESS X)
> + unlock_page B
> + mutex_unlock A
> +
> +We cannot detect this kind of deadlock with lockdep, even though we
> +apply the dependency check using lockdep on lock_page().
> +
> +Example 3)
> +
> + PROCESS X PROCESS Y
> + -------------- --------------
> + mutex_lock A
> + mutex_lock A
> + mutex_unlock A
> + wait_for_complete B // DEADLOCK
> + complete B
> + mutex_unlock A
> +
> +wait_for_complete() and complete() also can cause a deadlock, however
> +we cannot detect it with lockdep, either.
> +
> +
> +lockdep's assumption
> +--------------------
> +
> +Lockdep (not crossrelease featured one) assumes that,
> +
> + A lock will be unlocked within the context holding the lock.
> +
> +Thus we can say that a lock has dependency with all locks being already
> +in held_locks. So we can check dependency and build the dependency graph
> +when acquiring it.
> +
> +
> +lockdep's limitation
> +--------------------
> +
> +It can be applied only to typical lock operations, e.g. spin_lock,
> +mutex and so on. However, e.g. lock_page() cannot play with the lockdep
> +thingy because it violates assumptions of lockdep, even though it's
> +considered as a lock for the page access.
> +
> +In the view point of lockdep, it must be released within the context
> +which held the lock, however, the page lock can be released by different
> +context from the context which held it. Another example,
> +wait_for_complete() and complete() are also the case by nature, which
> +the lockdep cannot deal with.
> +
> +
> +========================
> +How to solve the problem
> +========================
> +
> +What causes deadlock
> +--------------------
> +
> +Not only lock operations, but also any operations causing to wait or
> +spin for something can cause deadlock unless it's eventually *released*
> +by someone. The important point here is that the waiting or spinning
> +must be *released* by someone. In other words, we have to focus whether
> +the waiting or spinning can be *released* or not to check a deadlock
> +possibility, rather than the waiting or spinning itself.
> +
> +In this point of view, typical lock is a special case where the acquire
> +context is same as the release context, so no matter in which context
> +the checking is performed for typical lock.
> +
> +Of course, in order to be able to report deadlock imediately at the time
> +real deadlock actually occures, the checking must be performed before
> +actual blocking or spinning happens when acquiring it. However, deadlock
> +*possibility* can be detected and reported even the checking is done
> +when releasing it, which means the time we can identify the release
> +context.
> +
> +
> +Relax the assumption
> +--------------------
> +
> +Since whether waiting or spinning can be released or not is more
> +important to check deadlock possibility as decribed above, we can relax
> +the assumtion the lockdep has.
> +
> + A lock can be unlocked in any context, unless the context
> + itself causes a deadlock e.g. acquiring a lock in irq-safe
> + context before releasing the lock in irq-unsafe context.
> +
> +A lock has dependency with all locks in the release context, having been
> +held since the lock was held. Thus basically we can check the dependency
> +only after we identify the release context at first. Of course, we can
> +identify the release context at the time acquiring a lock for typical
> +lock because the acquire context is same as the release context.
> +
> +However, generally if we cannot identify the release context at the time
> +acquiring a lock, we have to wait until the lock having been held will
> +be eventually released to identify the release context.
> +
> +
> +Relax the limitation
> +--------------------
> +
> +Given that the assumption is relaxed, we can check dependency and detect
> +deadlock possibility not only for typical lock, but also for lock_page()
> +using PG_locked, wait_for_xxx() and so on which might be released by
> +different context from the context which held the lock.
> +
> +
> +==============
> +Implementation
> +==============
> +
> +Introduce crosslock
> +-------------------
> +
> +Crossrelease feature names a lock "crosslock" if it is releasable by a
> +different context from the context which held the lock. All locks
> +having been held in the context unlocking the crosslock, since the
> +crosslock was held until eventually it's unlocked, have dependency with
> +the crosslock. That's the key idea to implement crossrelease feature.
> +
> +
> +Introduce commit stage
> +----------------------
> +
> +Crossrelease feature names it "commit", to check dependency and build
> +the dependency graph and chain in batches. Lockdep already performs what
> +the commit does, when acquiring a lock. This way it works for typical
> +lock, since it's guarrented that the acquire context is same as the
> +release context for that. However, that way must be changed for
> +crosslock so that it identify the release context when releasing the
> +crosslock and then performs the commit.
> +
> +Let's demonstrate it though several examples.
> +
> +The below is how the current lockdep works for typical lock. Note that
> +context 1 == context 2 for typical lock.
> +
> + CONTEXT 1 CONTEXT 2
> + acquiring A releasing A
> + ------------- -------------
> + acquire A acquire A
> +
> + acquire B acquire B -> build "A - B"
> +
> + acquire C acquire C -> build "A - C"
> +
> + release C release C
> +
> + release B release B
> +
> + release A release A
> +
> +After building "A - B", the dependency graph forms like,
> +
> +A - B
> +
> +And after building "A - C", the dependency graph forms like,
> +
> + - B
> + /
> +A -
> + \
> + - C
> +
> +What if we apply the commit to lockdep for typical lock? Of course, it's
> +not necessary for typical lock. Just a example for what the commit does.
> +
> + CONTEXT 1 CONTEXT 2
> + acquiring A releasing A
> + ------------- -------------
> + acquire A acquire A -> mark A started
> +
> + acquire B acquire B -> queue B
> +
> + acquire C acquire C -> queue C
> +
> + release C release C
> +
> + release B release B
> +
> + release A release A -> commit A (build "A - B", "A - C")
> +
> +After commiting A, the dependency graph forms like, at a time
> +
> + - B
> + /
> +A -
> + \
> + - C
> +
> +Here we can see both the former and the latter end in building a same
> +dependency graph consisting of "A - B" and "A - C". Of course, the
> +former can build the graph earlier than the latter, which means the
> +former can detect a deadlock sooner, maybe, as soon as possible. So the
> +former would be prefered if possible.
> +
> +Let's look at the way the commit works for crosslock.
> +
> + CONTEXT 1 CONTEXT 2
> + acquiring A releasing A
> + ------------- -------------
> + lock A
> + acquire A -> mark A started
> +
> + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <- serialized by some means
> +
> + acquire X -> queue X
> + lock X
> + acquire B
> + release X
> + acquire C
> + acquire Y -> queue Y
> + release C
> + release Y
> + release B
> + release A -> commit A (build "A - X", "A - Y")
> +
> +where A is a crosslock and the others are typical locks.
> +
> +Since a crosslock is held, it starts to queue all candidates whenever
> +acquiring typical lock, and keep it until finally it will be released.
> +Then it can commit all proper candidates queued until now. In other
> +words, it checks dependency and builds the dependency graph and chain
> +at the commit stage.
> +
> +And it is serialized so that the sequence, A -> X -> Y can be seen,
> +using atomic operations and proper barriers.
> +
> +
> +Acquire vs commit vs release
> +----------------------------
> +
> +What the lockdep should do in each stage to make it work even for
> +crosslock is like, (Note that it does not change any current logic
> +by which lockdep works, but only adds additional detection capability.)
> +
> +1. Acquire
> +
> + 1) For typical lock
> +
> + It's queued into additional data structure so that the
> + commit stage can check depedndency and build the
> + dependency graph and chain with this later.
> +
> + 2) For crosslock
> +
> + It's added to the global linked list so that the commit
> + stage can check dependency and build the dependency
> + graph and chain with this later.
> +
> +2. Commit
> +
> + 1) For typical lock
> +
> + N/A.
> +
> + 2) For crosslock
> +
> + It checks dependency and builds the dependency graph and
> + chain with data saved in the acquire stage. Here, we
> + establish dependency between the crosslock we are
> + unlocking now and all locks in that context, having been
> + held since the lock was held. Of course, it avoids
> + unnecessary checking and building as far as possible.
> +
> +3. Release
> +
> + 1) For typical lock
> +
> + No change.
> +
> + 2) For crosslock
> +
> + Just remove this crosslock from the global linked list,
> + to which the crosslock was added at acquire stage.
> + Release operation should be used with commit operation
> + together for crosslock.
> +
> +
> +Data structures
> +---------------
> +
> +Crossrelease feature introduces two new data structures.
> +
> +1. pend_lock (or plock)
> +
> + This is for keeping locks waiting to be committed so that the
> + actual dependency graph and chain can be built in the commit
> + stage. Every task_struct has a pend_lock array to keep these
> + locks. pend_lock entry will be consumed and filled whenever
> + lock_acquire() is called for typical lock and will be flushed,
> + namely committed at proper time. And this array is managed by
> + circular buffer mean.
> +
> +2. cross_lock (or xlock)
> +
> + This keeps some additional data only for crosslock. One
> + instance exists per one crosslock's lockdep_map.
> + lockdep_init_map_crosslock() should be used instead of
> + lockdep_init_map() to use a lock as a crosslock.
> +
> +
> +=============
> +Optimizations
> +=============
> +
> +Adding a pend_lock is an operation which very frequently happened
> +because it happens whenever a typical lock is acquired. So the operation
> +is implemented locklessly using rcu mechanism if possible. Unfortunitly,
> +we cannot apply this optimization if any object managed by rcu e.g.
> +xlock is on stack or somewhere else where it can be freed or destroyed
> +unpredictably.
> +
> +And chain cache for crosslock is also used to avoid unnecessary checking
> +and building dependency, like how the lockdep is already doing for that
> +purpose for typical lock.
> +
> --
> 1.9.1