Re: perf: recursive locking of ctx->mutex

From: Peter Zijlstra
Date: Fri May 01 2015 - 10:13:56 EST



A little something like so should cure that me thinks.

I would much appreciate other people reviewing this with care though; my
snot addled brain isn't too bright.


---
Subject: perf: Annotate inherited event ctx->mutex recursion
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Fri May 1 16:08:46 CEST 2015

While fuzzing Sasha tripped over another ctx->mutex recursion lockdep
splat. Annotate this.

Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Vince Weaver <vincent.weaver@xxxxxxxxx>
Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/events/core.c | 43 ++++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -914,10 +914,30 @@ static void put_ctx(struct perf_event_co
* Those places that change perf_event::ctx will hold both
* perf_event_ctx::mutex of the 'old' and 'new' ctx value.
*
- * Lock ordering is by mutex address. There is one other site where
- * perf_event_context::mutex nests and that is put_event(). But remember that
- * that is a parent<->child context relation, and migration does not affect
- * children, therefore these two orderings should not interact.
+ * Lock ordering is by mutex address. There are two other sites where
+ * perf_event_context::mutex nests and those are:
+ *
+ * - perf_event_exit_task_context() [ child , 0 ]
+ * __perf_event_exit_task()
+ * sync_child_event()
+ * put_event() [ parent, 1 ]
+ *
+ * - perf_event_init_context() [ parent, 0 ]
+ * inherit_task_group()
+ * inherit_group()
+ * inherit_event()
+ * perf_event_alloc()
+ * perf_init_event()
+ * perf_try_init_event() [ child , 1 ]
+ *
+ * While it appears there is an obvious deadlock here -- the parent and child
+ * nesting levels are inverted between the two. This is in fact safe because
+ * life-time rules separate them. That is an exiting task cannot fork, and a
+ * spawning task cannot (yet) exit.
+ *
+ * But remember that that these are parent<->child context relations, and
+ * migration does not affect children, therefore these two orderings should not
+ * interact.
*
* The change in perf_event::ctx does not affect children (as claimed above)
* because the sys_perf_event_open() case will install a new event and break
@@ -3658,9 +3678,6 @@ static void perf_remove_from_owner(struc
}
}

-/*
- * Called when the last reference to the file is gone.
- */
static void put_event(struct perf_event *event)
{
struct perf_event_context *ctx;
@@ -3698,6 +3715,9 @@ int perf_event_release_kernel(struct per
}
EXPORT_SYMBOL_GPL(perf_event_release_kernel);

+/*
+ * Called when the last reference to the file is gone.
+ */
static int perf_release(struct inode *inode, struct file *file)
{
put_event(file->private_data);
@@ -7370,7 +7390,12 @@ static int perf_try_init_event(struct pm
return -ENODEV;

if (event->group_leader != event) {
- ctx = perf_event_ctx_lock(event->group_leader);
+ /*
+ * This ctx->mutex can nest when we're called through
+ * inheritance. See the perf_event_ctx_lock_nested() comment.
+ */
+ ctx = perf_event_ctx_lock_nested(event->group_leader,
+ SINGLE_DEPTH_NESTING);
BUG_ON(!ctx);
}

@@ -8728,7 +8753,7 @@ static int perf_event_init_context(struc
* Lock the parent list. No need to lock the child - not PID
* hashed yet and not running, so nobody can access it.
*/
- mutex_lock(&parent_ctx->mutex);
+ mutex_lock_nested(&parent_ctx->mutex);

/*
* We dont have to disable NMIs - we are only looking at
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/