[tip:sched/core] sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions

From: tip-bot for Nicholas Mc Guire
Date: Wed Oct 17 2018 - 05:11:30 EST

Commit-ID: 11e13696a08e838ba48c72404c2b3f41429b5b20
Gitweb: https://git.kernel.org/tip/11e13696a08e838ba48c72404c2b3f41429b5b20
Author: Nicholas Mc Guire <hofrat@xxxxxxxxx>
AuthorDate: Tue, 16 Oct 2018 15:45:39 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 17 Oct 2018 08:30:10 +0200

sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions

To prevent dynamic completion objects from being de-allocated while still
in use, add a recommendation to embed them in long lived data structures.

Also add a note for the on-stack case that emphasizes the dangers of
the limited scope, and recommends dynamic allocation if scope limitations
are not clearly understood.

[ mingo: Minor touch-ups of the text, expanded it a bit to make the
warnings Nicholas added more prominent. ]

Signed-off-by: Nicholas Mc Guire <hofrat@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: john.garry@xxxxxxxxxx
Link: http://lkml.kernel.org/r/1539697539-24055-1-git-send-email-hofrat@xxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Documentation/scheduler/completion.txt | 42 +++++++++++++++++++++++++++-------
1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/Documentation/scheduler/completion.txt b/Documentation/scheduler/completion.txt
index 91a11a668354..2dbff579f957 100644
--- a/Documentation/scheduler/completion.txt
+++ b/Documentation/scheduler/completion.txt
@@ -70,8 +70,18 @@ Good, intuitive naming (as always) helps code readability. Naming a completion
Initializing completions:

-Initialization of dynamically allocated completion objects, often embedded in
-other structures, is done via a call to init_completion():
+Dynamically allocated completion objects should preferably be embedded in data
+structures that are assured to be alive for the life-time of the function/driver,
+to prevent races with asynchronous complete() calls from occurring.
+Particular care should be taken when using the _timeout() or _killable()/_interruptible()
+variants of wait_for_completion(), as it must be assured that memory de-allocation
+does not happen until all related activities (complete() or reinit_completion())
+have taken place, even if these wait functions return prematurely due to a timeout
+or a signal triggering.
+Initializing of dynamically allocated completion objects is done via a call to


@@ -99,16 +109,32 @@ Note that in this case the completion is boot time (or module load time)
initialized to 'not done' and doesn't require an init_completion() call.

When a completion is declared as a local variable within a function,
-then the initialization should always use:
+then the initialization should always use DECLARE_COMPLETION_ONSTACK()
+explicitly, not just to make lockdep happy, but also to make it clear
+that limited scope had been considered and is intentional:


-A simple DECLARE_COMPLETION() on the stack makes lockdep unhappy.
Note that when using completion objects as local variables you must be
-aware of the short life time of the function stack: the function must
-not return to a calling context until all activities (such as waiting
-threads) have ceased and the completion is ... completely unused.
+acutely aware of the short life time of the function stack: the function
+must not return to a calling context until all activities (such as waiting
+threads) have ceased and the completion object is completely unused.
+To emphasise this again: in particular when using some of the waiting API variants
+with more complex outcomes, such as the timeout or signalling (_timeout(),
+_killable() and _interruptible()) variants, the wait might complete
+prematurely while the object might still be in use by another thread - and a return
+from the wait_on_completion*() caller function will deallocate the function
+stack and cause subtle data corruption if a complete() is done in some
+other thread. Simple testing might not trigger these kinds of races.
+If unsure, use dynamically allocated completion objects, preferably embedded
+in some other long lived object that has a boringly long life time which
+exceeds the life time of any helper threads using the completion object,
+or has a lock or other synchronization mechanism to make sure complete()
+is not called on a freed object.
+A naive DECLARE_COMPLETION() on the stack triggers a lockdep warning.

Waiting for completions: