Re: [PATCH V5 5/6] perf, x86: drain PEBS buffer during context switch

From: Peter Zijlstra
Date: Mon Mar 30 2015 - 09:51:03 EST


On Mon, Feb 23, 2015 at 09:25:55AM -0500, Kan Liang wrote:
> From: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
>
> Flush the PEBS buffer during context switch if PEBS interrupt threshold
> is larger than one. This allows perf to supply TID for sample outputs.
>
> Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/perf_event.h | 3 +++
> arch/x86/kernel/cpu/perf_event_intel.c | 11 +++++++++-
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 33 ++++++++++++++++++++++++++++--
> arch/x86/kernel/cpu/perf_event_intel_lbr.c | 3 ---
> 4 files changed, 44 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index bc4ae3b..b4f6431 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -151,6 +151,7 @@ struct cpu_hw_events {
>  	 */
>  	struct debug_store *ds;
>  	u64 pebs_enabled;
> +	bool pebs_sched_cb_enabled;
>
>  	/*
>  	 * Intel LBR bits

Why do we need that extra state? I would've expected to see an inc/dec
for every AUTO_RELOAD event that gets added/removed.
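
A minimal sketch of the inc/dec scheme suggested above, written as a
standalone userspace model rather than kernel code. struct pebs_model, the
n_auto_reload field and the model_* helpers are hypothetical names used
only for illustration; sched_cb_refs merely stands in for whatever
perf_sched_cb_inc()/perf_sched_cb_dec() count internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU state; this is NOT struct cpu_hw_events. */
struct pebs_model {
	int n_pebs;        /* all active PEBS events */
	int n_auto_reload; /* active PEBS events with AUTO_RELOAD (assumed field) */
	int sched_cb_refs; /* models the perf_sched_cb_inc()/dec() reference count */
};

static void model_sched_cb_inc(struct pebs_model *m)
{
	m->sched_cb_refs++;
}

static void model_sched_cb_dec(struct pebs_model *m)
{
	assert(m->sched_cb_refs > 0); /* a dec must pair with an earlier inc */
	m->sched_cb_refs--;
}

/* One inc for every AUTO_RELOAD event that gets added. */
static void model_pebs_add(struct pebs_model *m, bool auto_reload)
{
	m->n_pebs++;
	if (auto_reload) {
		m->n_auto_reload++;
		model_sched_cb_inc(m);
	}
}

/* One dec for every AUTO_RELOAD event that gets removed. */
static void model_pebs_del(struct pebs_model *m, bool auto_reload)
{
	assert(m->n_pebs > 0);
	m->n_pebs--;
	if (auto_reload) {
		assert(m->n_auto_reload > 0);
		m->n_auto_reload--;
		model_sched_cb_dec(m);
	}
}
```

In this model the callback reference count always equals the number of
active AUTO_RELOAD events, so no separate boolean is needed to remember
whether the callback is currently enabled.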

> @@ -704,13 +717,20 @@ void intel_pmu_pebs_enable(struct perf_event *event)
>  	 * When the event is constrained enough we can use a larger
>  	 * threshold and run the event with less frequent PMI.
>  	 */
> -	if (0 && /* disable this temporarily */
> -	    (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) &&
> +	if ((hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) &&
>  	    !(event->attr.sample_type & ~PEBS_FREERUNNING_FLAGS)) {
>  		threshold = ds->pebs_absolute_maximum -
>  			x86_pmu.max_pebs_events * x86_pmu.pebs_record_size;
> +		if (first_pebs) {
> +			perf_sched_cb_inc(event->ctx->pmu);
> +			cpuc->pebs_sched_cb_enabled = true;
> +		}
>  	} else {
>  		threshold = ds->pebs_buffer_base + x86_pmu.pebs_record_size;
> +		if (cpuc->pebs_sched_cb_enabled) {
> +			perf_sched_cb_dec(event->ctx->pmu);
> +			cpuc->pebs_sched_cb_enabled = false;
> +		}
>  	}
>  	if (first_pebs || ds->pebs_interrupt_threshold > threshold)
>  		ds->pebs_interrupt_threshold = threshold;

I'm confused; why do you take the perf_sched_cb_dec() path for every event
that isn't AUTO_RELOAD?
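
To make the question concrete, here is a small userspace model of only the
enable path in the quoted hunk (the disable side is not shown in this mail
and is not modelled). struct hunk_model, model_pebs_enable(), the
large_threshold parameter and the else_visits counter are hypothetical;
large_threshold stands for the whole "AUTO_RELOAD and only free-running
sample_type flags" condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the quoted hunk. */
struct hunk_model {
	bool pebs_sched_cb_enabled; /* the flag the patch adds */
	int sched_cb_refs;          /* models perf_sched_cb_inc()/dec() */
	int else_visits;            /* how often the else branch was taken */
};

/*
 * Mirrors the branch structure of the hunk in intel_pmu_pebs_enable():
 * inc only for the first PEBS event when the large threshold applies,
 * and a flag-guarded dec on every pass through the else branch.
 */
static void model_pebs_enable(struct hunk_model *m, bool large_threshold,
			      bool first_pebs)
{
	if (large_threshold) {
		if (first_pebs) {
			m->sched_cb_refs++;
			m->pebs_sched_cb_enabled = true;
		}
	} else {
		m->else_visits++;
		if (m->pebs_sched_cb_enabled) {
			assert(m->sched_cb_refs > 0);
			m->sched_cb_refs--;
			m->pebs_sched_cb_enabled = false;
		}
	}
}
```

Enabling one large-threshold event first and then two other events visits
the else branch twice, but only the first visit actually drops the
reference; the boolean exists solely to suppress the later decrements.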