Re: [RFC PATCH v2 2/4] perf: Allow adding fixed random jitter to the alternate sampling period
From: James Clark
Date: Mon Apr 22 2024 - 09:08:46 EST
On 22/04/2024 11:49, Ben Gainey wrote:
> This change modifies the core perf overflow handler, adding some small
> random jitter to each sample period whenever an event switches between the
> two alternate sample periods. A new flag is added to perf_event_attr to
> opt into this behaviour.
>
> This change follows the discussion in [1], where it is recognized that it
> may be possible for certain patterns of execution to end up with biased
> results.
>
> [1] https://lore.kernel.org/linux-perf-users/Zc24eLqZycmIg3d2@tassilo/
>
> Signed-off-by: Ben Gainey <ben.gainey@xxxxxxx>
> ---
> include/uapi/linux/perf_event.h | 3 ++-
> kernel/events/core.c | 11 ++++++++++-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 5c1701d091cf..dd3697a4b300 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -461,7 +461,8 @@ struct perf_event_attr {
> inherit_thread : 1, /* children only inherit if cloned with CLONE_THREAD */
> remove_on_exec : 1, /* event is removed from task on exec */
> sigtrap : 1, /* send synchronous SIGTRAP on event */
> - __reserved_1 : 26;
> + jitter_alternate_period : 1, /* add a limited amount of jitter on each alternate period */
> + __reserved_1 : 25;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 07f1f931e18e..079ae520e836 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -15,6 +15,7 @@
> #include <linux/idr.h>
> #include <linux/file.h>
> #include <linux/poll.h>
> +#include <linux/random.h>
> #include <linux/slab.h>
> #include <linux/hash.h>
> #include <linux/tick.h>
> @@ -9546,6 +9547,8 @@ static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *r
> return true;
> }
>
> +# define MAX_ALT_SAMPLE_PERIOD_JITTER 16
> +
Is 16 enough to make a difference with very large alternate periods? I'm
wondering if it's worth making it customisable and instead of adding the
boolean option add a 16 bit jitter field. Or the option could still be a
boolean but the jitter value is some ratio of the alt sample period, like:
get_random_u32_below(max(16, alternative_sample_period >> 4))
> /*
> * Generic event overflow handling, sampling.
> */
> @@ -9573,7 +9576,10 @@ static int __perf_event_overflow(struct perf_event *event,
> if (event->attr.alternative_sample_period) {
> bool using_alt = hwc->using_alternative_sample_period;
> u64 sample_period = (using_alt ? event->attr.sample_period
> - : event->attr.alternative_sample_period);
> + : event->attr.alternative_sample_period)
> + + (event->attr.jitter_alternate_period
> + ? get_random_u32_below(MAX_ALT_SAMPLE_PERIOD_JITTER)
> + : 0);
>
> hwc->sample_period = sample_period;
> hwc->using_alternative_sample_period = !using_alt;
> @@ -12503,6 +12509,9 @@ SYSCALL_DEFINE5(perf_event_open,
> }
> }
>
> + if (attr.jitter_alternate_period && !attr.alternative_sample_period)
> + return -EINVAL;
> +
> /* Only privileged users can get physical addresses */
> if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR)) {
> err = perf_allow_kernel(&attr);