Re: [PATCH 03/12] perf: Allocate context task_ctx_data for child event

From: Peter Zijlstra
Date: Mon Jan 08 2018 - 07:14:36 EST


On Sun, Jan 07, 2018 at 05:03:47PM +0100, Jiri Olsa wrote:
> Currently we use perf_event_context::task_ctx_data to save
> and restore the LBR status when the task is scheduled out
> and in.
>
> We don't allocate it for child contexts, which results in
> a shorter LBR stack for the task, because we don't save the
> history from the previous run and start over every time we
> schedule the task in.
>
> I made a test to generate samples with LBR call stack
> and got higher numbers on bigger chain depths:
>
>                           before:     after:
> LBR call chain: nr:  1      60561     498127
> LBR call chain: nr:  2          0          0
> LBR call chain: nr:  3     107030       2172
> LBR call chain: nr:  4     466685      62758
> LBR call chain: nr:  5    2307319     878046
> LBR call chain: nr:  6      48713     495218
> LBR call chain: nr:  7       1040       4551
> LBR call chain: nr:  8        481        172
> LBR call chain: nr:  9        878        120
> LBR call chain: nr: 10       2377       6698
> LBR call chain: nr: 11      28830     151487
> LBR call chain: nr: 12      29347     339867
> LBR call chain: nr: 13          4         22
> LBR call chain: nr: 14          3         53
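
To illustrate the effect described above, here is a minimal userspace
sketch, not kernel code. All names in it (struct lbr_hw, struct lbr_saved,
lbr_sched_out(), lbr_sched_in(), LBR_DEPTH) are hypothetical stand-ins, and
the model is deliberately simplified: it only appends entries and ignores
both ring-buffer wraparound and the pop-on-return behaviour of call-stack
mode. It shows that without a save area the recorded depth restarts at
every schedule-in, while with one the history carries over.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LBR_DEPTH 16	/* illustrative depth, not tied to a specific CPU */

/* Hypothetical stand-in for the hardware LBR stack. */
struct lbr_hw {
	unsigned int nr;		/* number of valid entries */
	unsigned long from[LBR_DEPTH];	/* recorded branch sources */
};

/* Hypothetical stand-in for perf_event_context::task_ctx_data. */
struct lbr_saved {
	unsigned int nr;
	unsigned long from[LBR_DEPTH];
};

/* Record one branch; entries beyond LBR_DEPTH are dropped in this model. */
static void lbr_record(struct lbr_hw *hw, unsigned long from)
{
	if (hw->nr < LBR_DEPTH)
		hw->from[hw->nr++] = from;
}

/* Schedule out: the state can only be saved if a save area exists. */
static void lbr_sched_out(const struct lbr_hw *hw, struct lbr_saved *save)
{
	if (!save)
		return;
	save->nr = hw->nr;
	memcpy(save->from, hw->from, sizeof(save->from));
}

/* Schedule in: restore the saved history, or start from an empty stack. */
static void lbr_sched_in(struct lbr_hw *hw, const struct lbr_saved *save)
{
	if (!save) {
		hw->nr = 0;
		return;
	}
	hw->nr = save->nr;
	memcpy(hw->from, save->from, sizeof(hw->from));
}

/*
 * Run two time slices of 'calls' branches each, with another task
 * clobbering the hardware state in between, and return the final depth.
 */
unsigned int lbr_depth_after_two_slices(struct lbr_saved *save,
					unsigned int calls)
{
	struct lbr_hw hw = { 0 };
	unsigned int i;

	for (i = 0; i < calls; i++)
		lbr_record(&hw, 0x1000 + i);
	lbr_sched_out(&hw, save);

	memset(&hw, 0, sizeof(hw));	/* some other task ran here */

	lbr_sched_in(&hw, save);
	for (i = 0; i < calls; i++)
		lbr_record(&hw, 0x2000 + i);

	return hw.nr;
}
```

Passing NULL for the save area models a child context without
task_ctx_data; passing a real buffer models the behaviour after the patch.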

Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

Fixes: 4af57ef28c2c ("perf: Add pmu specific data for perf task context")

> Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> ---
> kernel/events/core.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4df5b695bf0d..55fb648a32b0 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10703,6 +10703,19 @@ inherit_event(struct perf_event *parent_event,
>  	if (IS_ERR(child_event))
>  		return child_event;
>
> +
> +	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
> +	    !child_ctx->task_ctx_data) {
> +		struct pmu *pmu = child_event->pmu;
> +
> +		child_ctx->task_ctx_data = kzalloc(pmu->task_ctx_size,
> +						   GFP_KERNEL);
> +		if (!child_ctx->task_ctx_data) {
> +			free_event(child_event);
> +			return NULL;
> +		}
> +	}
> +
>  	/*
>  	 * is_orphaned_event() and list_add_tail(&parent_event->child_list)
>  	 * must be under the same lock in order to serialize against
> @@ -10713,6 +10726,7 @@ inherit_event(struct perf_event *parent_event,
>  	if (is_orphaned_event(parent_event) ||
>  	    !atomic_long_inc_not_zero(&parent_event->refcount)) {
>  		mutex_unlock(&parent_event->child_mutex);
> +		/* task_ctx_data is freed with child_ctx */
>  		free_event(child_event);
>  		return NULL;
>  	}
> --
> 2.13.6
>
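
For readers following the ownership rule in the hunk above (allocate the
buffer once per child context, and leave freeing it to the context rather
than to the event error path), a rough userspace sketch of the same pattern
might look like the following. The struct and function names are
hypothetical, calloc() stands in for kzalloc(), and none of this is the
kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>

#define ATTACH_TASK_DATA 0x1	/* hypothetical, mirrors PERF_ATTACH_TASK_DATA */

/* Hypothetical stand-in for the child perf_event_context. */
struct ctx {
	void *task_ctx_data;
};

/* Hypothetical stand-in for the bits of the child event used here. */
struct event {
	unsigned int attach_state;
	size_t task_ctx_size;
};

/*
 * Allocate the per-context buffer the first time an event that needs it
 * is inherited. Returns 0 on success (including "nothing to do") and -1
 * on allocation failure.
 */
int ctx_ensure_task_data(struct ctx *ctx, const struct event *ev)
{
	if (!(ev->attach_state & ATTACH_TASK_DATA) || ctx->task_ctx_data)
		return 0;

	ctx->task_ctx_data = calloc(1, ev->task_ctx_size);
	return ctx->task_ctx_data ? 0 : -1;
}

/* The buffer belongs to the context and is released together with it. */
void ctx_free(struct ctx *ctx)
{
	free(ctx->task_ctx_data);
	ctx->task_ctx_data = NULL;
}
```

Because the buffer is tied to the context, an error path that drops only
the event does not need to touch it, which is what the added comment in the
second hunk points out.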