Re: [PATCH] perf: Fix locking for children siblings group read
From: Peter Zijlstra
Date: Thu Jul 20 2017 - 10:42:06 EST
On Thu, Jul 20, 2017 at 04:14:55PM +0200, Jiri Olsa wrote:
> We're missing ctx lock when iterating children siblings
> within the perf_read path for group reading. Following
> race and crash can happen:
>
> User space doing read syscall on event group leader:
>
> T1:
> perf_read
> lock event->ctx->mutex
> perf_read_group
> lock leader->child_mutex
> __perf_read_group_add(child)
> list_for_each_entry(sub, &leader->sibling_list, group_entry)
>
> ----> sub might be invalid at this point, because it could
> get removed via perf_event_exit_task_context in T2
>
> Child exiting and cleaning up its events:
>
> T2:
> perf_event_exit_task_context
> lock ctx->mutex
> list_for_each_entry_safe(child_event, next, &child_ctx->event_list,...
> perf_event_exit_event(child)
> lock ctx->lock
> perf_group_detach(child)
> unlock ctx->lock
>
> ----> child is removed from sibling_list without any sync
> with T1 path above
>
> ...
> free_event(child)
>
> Before the child is removed from the leader's child_list,
> (and thus is omitted from perf_read_group processing), we
> need to ensure that perf_read_group touches child's
> siblings under its ctx->lock.
One additional note; this bug got exposed by commit:
ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
which made it possible to actually trigger this code-path.
So while it doesn't fix a bug in that commit per-se, we should maybe
have a Fixes: tag with that commit in because that commit exposed it and
this should be added to any kernel that commit goes into.
> Tested-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Thanks Jiri!
> ---
> kernel/events/core.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 9747e422ab20..585955b41489 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4365,7 +4365,9 @@ EXPORT_SYMBOL_GPL(perf_event_read_value);
> static int __perf_read_group_add(struct perf_event *leader,
> u64 read_format, u64 *values)
> {
> + struct perf_event_context *ctx = leader->ctx;
> struct perf_event *sub;
> + unsigned long flags;
> int n = 1; /* skip @nr */
> int ret;
>
> @@ -4395,12 +4397,15 @@ static int __perf_read_group_add(struct perf_event *leader,
> if (read_format & PERF_FORMAT_ID)
> values[n++] = primary_event_id(leader);
>
> + raw_spin_lock_irqsave(&ctx->lock, flags);
> +
> list_for_each_entry(sub, &leader->sibling_list, group_entry) {
> values[n++] += perf_event_count(sub);
> if (read_format & PERF_FORMAT_ID)
> values[n++] = primary_event_id(sub);
> }
>
> + raw_spin_unlock_irqrestore(&ctx->lock, flags);
> return 0;
> }
>
> --
> 2.9.4
>