Re: [PATCH V9 7/8] perf, x86: introduce PERF_RECORD_LOST_SAMPLES

From: Arnaldo Carvalho de Melo
Date: Mon May 11 2015 - 15:06:57 EST


On Sun, May 10, 2015 at 03:13:14PM -0400, Kan Liang wrote:
> From: Kan Liang <kan.liang@xxxxxxxxx>
>
> After enlarging the PEBS interrupt threshold, some PEBS samples may get
> mixed up and are then discarded by the kernel. This patch makes the kernel
> emit a PERF_RECORD_LOST_SAMPLES record with the number of possible
> discards when it is impossible to demux the samples, so that the
> user is not left in the dark about such discards.

OK, but it would be nice to spell out what the tooling needs to do here,
i.e. when more than one event maps to the same mmap ring buffer,
the user has to set perf_event_attr.sample_id_all and have
PERF_SAMPLE_ID in its perf_event_attr.sample_type, if telling the events
apart is desired. I.e. the event the discards belong to is the one
identified by the PERF_SAMPLE_ID payload, when present.
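
To make that concrete, here is a minimal sketch of the attr setup I have
in mind. It is not taken from the patch or from the perf tool sources:
init_shared_ring_attr() is a made-up helper name, and the event type,
config and period are placeholders. The only parts that matter are
PERF_SAMPLE_ID in sample_type and sample_id_all.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: fill in an attr for one sampling event that may
 * share an mmap ring buffer with other events.
 */
static void init_shared_ring_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;		/* placeholder event */
	attr->config = PERF_COUNT_HW_CPU_CYCLES;	/* placeholder event */
	attr->sample_period = 100003;			/* placeholder period */
	attr->precise_ip = 1;	/* nonzero, as the PEBS drain code expects */

	/* PERF_SAMPLE_ID puts the event ID into every PERF_RECORD_SAMPLE ... */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ID;
	/* ... and sample_id_all appends the ID fields to non-sample records. */
	attr->sample_id_all = 1;
}
```

The attr would then be handed to perf_event_open() for each event that
shares the ring buffer; that call is left out so the sketch needs no
privileges to build and run.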

That is probably what you did when using this in the tooling, let me
see... ;-)
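
For reference, a rough sketch of what the consuming side could look like,
going by the record layout in the uapi comment quoted below (header, u64
lost, then the sample_id trailer). The names lost_samples_info and
parse_lost_samples() are made up for illustration. It assumes the caller
passes a contiguous copy of one record (no ring buffer wrap handling) and
only skips the TID and TIME fields that precede the ID in the trailer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Value the patch assigns to PERF_RECORD_LOST_SAMPLES, mirrored locally
 * so the sketch also builds against headers that predate it.
 */
#define LOST_SAMPLES_RECORD_TYPE 13

struct lost_samples_info {	/* hypothetical helper type */
	uint64_t lost;		/* number of possibly discarded samples */
	uint64_t id;		/* event ID, valid only if has_id is set */
	int has_id;
};

/*
 * Decode one record. Returns 0 on success, -1 if the record is not a
 * LOST_SAMPLES record or is too short for the fields we expect.
 */
static int parse_lost_samples(const void *rec, uint64_t sample_type,
			      int sample_id_all,
			      struct lost_samples_info *out)
{
	struct perf_event_header hdr;
	const unsigned char *p = rec;
	size_t off = sizeof(hdr);

	assert(rec != NULL && out != NULL);

	memcpy(&hdr, p, sizeof(hdr));
	if (hdr.type != LOST_SAMPLES_RECORD_TYPE ||
	    hdr.size < off + sizeof(uint64_t))
		return -1;

	memcpy(&out->lost, p + off, sizeof(uint64_t));
	off += sizeof(uint64_t);
	out->id = 0;
	out->has_id = 0;

	/* Without sample_id_all and PERF_SAMPLE_ID there is no ID to read. */
	if (!sample_id_all || !(sample_type & PERF_SAMPLE_ID))
		return 0;

	/* The trailer stores { u32 pid, tid; } and { u64 time; } before id. */
	if (sample_type & PERF_SAMPLE_TID)
		off += sizeof(uint64_t);
	if (sample_type & PERF_SAMPLE_TIME)
		off += sizeof(uint64_t);
	if (hdr.size < off + sizeof(uint64_t))
		return -1;

	memcpy(&out->id, p + off, sizeof(uint64_t));
	out->has_id = 1;
	return 0;
}
```

The tool would then look up the returned id among the IDs of the events
it opened on that ring buffer and account the lost count to that event.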

- Arnaldo

> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 20 +++++++++++++++----
> include/linux/perf_event.h | 3 +++
> include/uapi/linux/perf_event.h | 12 +++++++++++
> kernel/events/core.c | 33 +++++++++++++++++++++++++++++++
> 4 files changed, 64 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 328b10c..18afea0b 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -1127,6 +1127,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> void *base, *at, *top;
> int bit;
> short counts[MAX_PEBS_EVENTS] = {};
> + short error[MAX_PEBS_EVENTS] = {};
>
> if (!x86_pmu.pebs_active)
> return;
> @@ -1170,21 +1171,32 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> /* slow path */
> pebs_status = p->status & cpuc->pebs_enabled;
> pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
> - if (pebs_status != (1 << bit))
> + if (pebs_status != (1 << bit)) {
> + u8 i;
> +
> + for_each_set_bit(i, (unsigned long *)&pebs_status,
> + MAX_PEBS_EVENTS)
> + error[i]++;
> continue;
> + }
> }
> counts[bit]++;
> }
>
> for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
> - if (counts[bit] == 0)
> + if ((counts[bit] == 0) && (error[bit] == 0))
> continue;
> event = cpuc->events[bit];
> WARN_ON_ONCE(!event);
> WARN_ON_ONCE(!event->attr.precise_ip);
>
> - __intel_pmu_pebs_event(event, iregs, base,
> - top, bit, counts[bit]);
> + /* log dropped samples number */
> + if (error[bit])
> + perf_log_lost_samples(event, error[bit]);
> +
> + if (counts[bit])
> + __intel_pmu_pebs_event(event, iregs, base,
> + top, bit, counts[bit]);
> }
> }
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index bed1b6f..d47d792 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -747,6 +747,9 @@ perf_event__output_id_sample(struct perf_event *event,
> struct perf_output_handle *handle,
> struct perf_sample_data *sample);
>
> +extern void
> +perf_log_lost_samples(struct perf_event *event, u64 lost);
> +
> static inline bool is_sampling_event(struct perf_event *event)
> {
> return event->attr.sample_period != 0;
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 309211b..bab1938 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -800,6 +800,18 @@ enum perf_event_type {
> */
> PERF_RECORD_ITRACE_START = 12,
>
> + /*
> + * Records the dropped/lost sample number.
> + *
> + * struct {
> + * struct perf_event_header header;
> + *
> + * u64 lost;
> + * struct sample_id sample_id;
> + * };
> + */
> + PERF_RECORD_LOST_SAMPLES = 13,
> +
> PERF_RECORD_MAX, /* non-ABI */
> };
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4d221a4..42f82c5 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5927,6 +5927,39 @@ void perf_event_aux_event(struct perf_event *event, unsigned long head,
> }
>
> /*
> + * Lost/dropped samples logging
> + */
> +void perf_log_lost_samples(struct perf_event *event, u64 lost)
> +{
> + struct perf_output_handle handle;
> + struct perf_sample_data sample;
> + int ret;
> +
> + struct {
> + struct perf_event_header header;
> + u64 lost;
> + } lost_samples_event = {
> + .header = {
> + .type = PERF_RECORD_LOST_SAMPLES,
> + .misc = 0,
> + .size = sizeof(lost_samples_event),
> + },
> + .lost = lost,
> + };
> +
> + perf_event_header__init_id(&lost_samples_event.header, &sample, event);
> +
> + ret = perf_output_begin(&handle, event,
> + lost_samples_event.header.size);
> + if (ret)
> + return;
> +
> + perf_output_put(&handle, lost_samples_event);
> + perf_event__output_id_sample(event, &handle, &sample);
> + perf_output_end(&handle);
> +}
> +
> +/*
> * IRQ throttle logging
> */
>
> --
> 1.8.3.1