Re: [PATCH] perf: fix perf_event_context->time

From: Namhyung Kim
Date: Wed Mar 01 2023 - 17:29:55 EST


Hi Song,

On Tue, Feb 28, 2023 at 11:22 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> Time readers rely on perf_event_context->[time|timestamp|timeoffset] to get
> accurate time_enabled and time_running for an event. The difference between
> ctx->timestamp and ctx->time is the among of time when the context is not
> enabled. For cpuctx.ctx, time and timestamp should stay the same, however,

I'm not sure if it's correct. The timestamp can go when the context is disabled
for example, in ctx_resched() even if the NMI watchdog is enabled, right?

> it is not the case at the moment. To show this with drgn [1]:
>
> drgn 0.0.22 (using Python 3.8.6, elfutils 0.185, with libkdumpfile)
> For help, type help(drgn).
> ...
> >>> ctx = per_cpu_ptr(prog['pmu'].pmu_cpu_context, 0).ctx
> >>> ctx.timestamp * 1.0 / ctx.time
> (double)1.0385869134111765
> >>>
>
> ctx->timestamp is about 4% larger than ctx.time. This issue causes time
> read by perf_event_read_local() goes back in some cases.

I don't think the difference between the ctx.time and the timestamp
itself is a problem. I think the problem is it CAN update the timestamp
(and the timeoffset) while the context is enabled. Then depending on
the timing, event times can return smaller values than before.

>
> Fix this by updating the condition for __update_context_time(ctx, false).
> Specifically, it should only be called when we enable EVENT_TIME for the
> ctx.

So this change fixes the condition to update the timestamp only if
it moves from disabled to enabled state.

>
> [1] drgn: https://github.com/osandov/drgn
> Fixes: 09f5e7dc7ad7 ("perf: Fix perf_event_read_local() time")
> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Signed-off-by: Song Liu <song@xxxxxxxxxx>

Actually I got some similar reports lately but couldn't find time to
dig in deeply. Thanks for the fix!

Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks,
Namhyung


> ---
> kernel/events/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 380476a934e8..67478f43e26e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3872,7 +3872,7 @@ ctx_sched_in(struct perf_event_context *ctx, enum event_type_t event_type)
> if (likely(!ctx->nr_events))
> return;
>
> - if (is_active ^ EVENT_TIME) {
> + if (!(is_active & EVENT_TIME)) {
> /* start ctx time */
> __update_context_time(ctx, false);
> perf_cgroup_set_timestamp(cpuctx);
> --
> 2.30.2
>