[tip:perf/core] perf_events: Fix bogus context time tracking

From: tip-bot for Stephane Eranian
Date: Mon Oct 18 2010 - 15:18:27 EST


Commit-ID: c530ccd9a1864a44a7ff35826681229ce9f2357a
Gitweb: http://git.kernel.org/tip/c530ccd9a1864a44a7ff35826681229ce9f2357a
Author: Stephane Eranian <eranian@xxxxxxxxxx>
AuthorDate: Fri, 15 Oct 2010 15:26:01 +0200
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Mon, 18 Oct 2010 19:58:46 +0200

perf_events: Fix bogus context time tracking

You can only call update_context_time() when the context
is active, i.e., the thread it is attached to is still running.

However, perf_event_read() can be called even when the context
is inactive, e.g., user read() the counters. The call to
update_context_time() must be conditioned on the status of
the context, otherwise, bogus time_enabled, time_running may
be returned. Here is an example on AMD64. The task program
is an example from libpfm4. The -p prints deltas every 1s.

$ task -p -e cpu_clk_unhalted sleep 5
2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)

Whereas if you don't read deltas, e.g., no call to perf_event_read() until
the process terminates:

$ task -e cpu_clk_unhalted sleep 5
2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)

Notice that time_enable, time_running are bogus in the first example
causing bogus scaling.

This patch fixes the problem, by conditionally calling update_context_time()
in perf_event_read().

Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: stable@xxxxxxxxxx
LKML-Reference: <4cb856dc.51edd80a.5ae0.38fb@xxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/perf_event.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 1ec3916..e7eeba1 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1780,7 +1780,13 @@ static u64 perf_event_read(struct perf_event *event)
unsigned long flags;

raw_spin_lock_irqsave(&ctx->lock, flags);
- update_context_time(ctx);
+ /*
+ * may read while context is not active
+ * (e.g., thread is blocked), in that case
+ * we cannot update context time
+ */
+ if (ctx->is_active)
+ update_context_time(ctx);
update_event_times(event);
raw_spin_unlock_irqrestore(&ctx->lock, flags);
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/