Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer

From: James Clark
Date: Mon Jan 24 2022 - 09:58:27 EST




On 24/01/2022 11:49, Peter Zijlstra wrote:
> On Mon, Dec 06, 2021 at 11:38:40AM +0000, James Clark wrote:
>> When using per-process mode and event inheritance is set to true, forked
>> processes will create new perf events via inherit_event() ->
>> perf_event_alloc(). But these events will not have ring buffers assigned
>> to them. Any call to wakeup will be dropped if it's called on an event
>> with no ring buffer assigned because that's the object that holds the
>> wakeup list.
>>
>> If the child event is disabled due to a call to perf_aux_output_begin()
>> or perf_aux_output_end(), the wakeup is dropped leaving userspace
>> hanging forever on the poll.
>>
>> Normally the event is explicitly re-enabled by userspace after it wakes
>> up to read the aux data, but in this case it does not get woken up so
>> the event remains disabled.
>>
>> This can be reproduced when using Arm SPE and 'stress' which forks once
>> before running the workload. By looking at the list of aux buffers read,
>> it's apparent that they stop after the fork:
>>
>> perf record -e arm_spe// -vvv -- stress -c 1
>>
>> With this patch applied they continue to be printed. This behaviour
>> doesn't happen when using systemwide or per-cpu mode.
>>
>> Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@xxxxxxx>
>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>> ---
>
> Would this be the better patch?

Yes, I tested this and it also works. There is one other suspicious access
of ->rb followed by if (rb), in perf_poll(), but maybe it works out OK?

	mutex_lock(&event->mmap_mutex);
	rb = event->rb;
	if (rb)
		events = atomic_xchg(&rb->poll, 0);
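
To make the difference concrete, here is a rough userspace model of the
wakeup path with and without the parent redirect. It is only a sketch:
struct model_event, struct model_rb and the model_wakeup_*() helpers are
hypothetical stand-ins, not the kernel definitions, and RCU and locking
are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption, not real code). */
struct model_rb {
	int poll;			/* non-zero once a wakeup is pending */
};

struct model_event {
	struct model_event *parent;	/* NULL for the original event */
	struct model_rb *rb;		/* NULL for inherited (child) events */
};

/* Without the redirect: the wakeup is dropped when the event has no rb. */
static bool model_wakeup_old(struct model_event *event)
{
	if (!event->rb)
		return false;

	event->rb->poll = 1;
	return true;
}

/* With the redirect: a child event uses its parent's ring buffer. */
static bool model_wakeup_new(struct model_event *event)
{
	if (event->parent)
		event = event->parent;

	return model_wakeup_old(event);
}
```

In this model, a child event with rb == NULL loses its wakeup in
model_wakeup_old(), while model_wakeup_new() marks the parent's buffer.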

We also have a perf self test that covers this failure for Arm SPE now. Should I
post it separately, or together with your new version of this fix?

Thanks
James

>
>
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 479c9e672ec4..b1c1928c0e7c 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5985,6 +5985,8 @@ static void ring_buffer_attach(struct perf_event *event,
>  	struct perf_buffer *old_rb = NULL;
>  	unsigned long flags;
>
> +	WARN_ON_ONCE(event->parent);
> +
>  	if (event->rb) {
>  		/*
>  		 * Should be impossible, we set this when removing
> @@ -6042,6 +6044,9 @@ static void ring_buffer_wakeup(struct perf_event *event)
>  {
>  	struct perf_buffer *rb;
>
> +	if (event->parent)
> +		event = event->parent;
> +
>  	rcu_read_lock();
>  	rb = rcu_dereference(event->rb);
>  	if (rb) {
> @@ -6055,6 +6060,9 @@ struct perf_buffer *ring_buffer_get(struct perf_event *event)
>  {
>  	struct perf_buffer *rb;
>
> +	if (event->parent)
> +		event = event->parent;
> +
>  	rcu_read_lock();
>  	rb = rcu_dereference(event->rb);
>  	if (rb) {
> @@ -6763,7 +6771,7 @@ static unsigned long perf_prepare_sample_aux(struct perf_event *event,
>  	if (WARN_ON_ONCE(READ_ONCE(sampler->oncpu) != smp_processor_id()))
>  		goto out;
>
> -	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
> +	rb = ring_buffer_get(sampler);
>  	if (!rb)
>  		goto out;
>
> @@ -6829,7 +6837,7 @@ static void perf_aux_sample_output(struct perf_event *event,
>  	if (WARN_ON_ONCE(!sampler || !data->aux_size))
>  		return;
>
> -	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
> +	rb = ring_buffer_get(sampler);
>  	if (!rb)
>  		return;
>
>