Re: [PATCH 3/5] coresight: trbe: Keep TRBE disabled on overflow irq

From: Anshuman Khandual
Date: Sun Jul 18 2021 - 02:10:47 EST




On 7/15/21 1:58 PM, Suzuki K Poulose wrote:
> On 15/07/2021 04:54, Anshuman Khandual wrote:
>>
>>
>> On 7/12/21 5:08 PM, Suzuki K Poulose wrote:
>>> When an AUX buffer is marked TRUNCATED, the kernel will disable
>>> the event (via irq_work) and can only be re-enabled by the
>>> userspace tool.
>>
>> Will it be renabled via normal perf event enable callback OR an
>> explicit perf event restart is required upon the TRUNCATED flag ?
>
> The perf event moves to a DISABLED state. At this state an
> explicit restart from the userspace tool is required, via
> PERF_EVENT_IOC_ENABLE.

Got it.

>
>>
>>>
>>> Also, since we *always* mark the buffer as TRUNCATED (which is
>>> needs to be reconsidered, see below), we need not re-enable the
>>> TRBE as the event is going to be now disabled. This follows the
>>> SPE driver behavior.
>>>
>>> Without this change, we could leave the event disabled for
>>> ever under certain conditions. i.e, when we try to re-enable
>>> in the IRQ handler, if there is no space left in the buffer,
>>> we "TRUNCATE" the buffer and create a record of size 0.
>>> The perf tool skips such records and the event will remain
>>> disabled forever.
>>
>> Why ? Should not the user space tool explicitly start back the
>> tracing event after detecting zero sized record with buffer
>> marked with TRUNCATE flag ? What prevents it to restart the
>> event ?
>
> The perf tool discards a 0 sized packet. While I agree that
> this is something that should be fixed in perf tool too, this
> deviation from the "expected driver" behavior (see SPE driver
> which this one was inspired from) will break the existing
> perf tools (not an ABI break, but a functional issue
> which is not nice and generally discouraged in the kernel).

Makes sense not to cause deviation from the expected driver behaviour
for now, though perf tool should eventually accommodate a zero sized
trace record in the buffer and restart the event.

>
>>
>>>
>>> Regarding the use of TRUNCATED flag:
>>> With FILL mode, the TRBE stops collection when the buffer is
>>> full, raising the IRQ. Now, since the IRQ is asynchronous,
>>> we could loose some amount of trace, after the buffer was
>>> full until the IRQ is handled. Also the last packet may
>>> have been trimmed. The decoder can figure this out and
>>> cope with this. The side effect of this approach is:
>>>
>>>   We always disable the event when there is an IRQ, even
>>>   when the other half of the ring buffer is free ! This
>>>   is not ideal.
>>>
>>> Now, we should switch to using PARTIAL to indicate that there
>>> was potentially some partial trace packets in the buffer and
>>> some data was lost. We should use TRUNCATED only when there
>>> is absolutely no space in the ring buffer. This change would
>>> also require some additional changes in the CoreSight PMU
>>> framework to allow, sinks to "close" the handle (rather
>>> than the PMU driver closing the handle upon event_stop).
>>> So, until that is sorted, let us keep the TRUNCATED flag
>>> and the rework can be addressed separately.
>>
>> But I guess this is a separate problem all together.
>
> Yes, it is. As mentioned above, we need changes to the
> coresight PMU framework to be able to use the available
> buffer. And the message already is clear, that this
> is fixing the "odd" behavior.

Got it.

>
>>
>>>
>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>> Reported-by: Tamas Zsoldos <Tamas.Zsoldos@xxxxxxx>
>>> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>>> Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
>>> Cc: Mike Leach <mike.leach@xxxxxxxxxx>
>>> Cc: Leo Yan <leo.yan@xxxxxxxxxx>
>>> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>> Cc: Will Deacon <will@xxxxxxxxxx>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-trbe.c | 34 +++++++-------------
>>>   1 file changed, 12 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>>> index 176868496879..ec38cf17b693 100644
>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>> @@ -696,10 +696,8 @@ static void trbe_handle_spurious(struct perf_output_handle *handle)
>>>     static void trbe_handle_overflow(struct perf_output_handle *handle)
>>>   {
>>> -    struct perf_event *event = handle->event;
>>>       struct trbe_buf *buf = etm_perf_sink_config(handle);
>>>       unsigned long offset, size;
>>> -    struct etm_event_data *event_data;
>>>         offset = get_trbe_limit_pointer() - get_trbe_base_pointer();
>>>       size = offset - PERF_IDX2OFF(handle->head, buf);
>>> @@ -709,30 +707,22 @@ static void trbe_handle_overflow(struct perf_output_handle *handle)
>>>       /*
>>>        * Mark the buffer as truncated, as we have stopped the trace
>>>        * collection upon the WRAP event, without stopping the source.
>>> +     *
>>> +     * We don't re-enable the TRBE here, as the event is
>>> +     * bound to be disabled due to the TRUNCATED flag.
>>> +     * This is not ideal, as we could use the available space in
>>> +     * the ring buffer and continue the tracing.
>>> +     *
>>> +     * TODO: Revisit the use of TRUNCATED flag and may be instead use
>>> +     * PARTIAL, to indicate trace may contain partial packets.
>>> +     * And TRUNCATED can be used only if we do not have enough space
>>> +     * in the buffer. This would need additional changes in
>>> +     * etm_event_stop() to allow the sinks to leave a closed
>>> +     * aux_handle.
>>>        */
>>>       perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW |
>>>                        PERF_AUX_FLAG_TRUNCATED);
>>>       perf_aux_output_end(handle, size);
>>> -    event_data = perf_aux_output_begin(handle, event);
>>> -    if (!event_data) {
>>> -        /*
>>> -         * We are unable to restart the trace collection,
>>> -         * thus leave the TRBE disabled. The etm-perf driver
>>> -         * is able to detect this with a disconnected handle
>>> -         * (handle->event = NULL).
>>> -         */
>>> -        trbe_drain_and_disable_local();
>>> -        *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
>>> -        return;
>>> -    }
>>> -    buf->trbe_limit = compute_trbe_buffer_limit(handle);
>>> -    buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
>>> -    if (buf->trbe_limit == buf->trbe_base) {
>>> -        trbe_stop_and_truncate_event(handle);
>>> -        return;
>>> -    }
>>> -    *this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
>>> -    trbe_enable_hw(buf);
>>>   }
>>
>> The change here just stops the event restart after handling the IRQ
>> and marking the buffer with PERF_AUX_FLAG_TRUNCATED which helps the
>> event from being disabled for ever (still need to understand how !).
>
> The real issue is unnecessary starting of the event with new buffer
> after we have "TRUNCATED" the buffer. This can lead to occassionally
> hitting "0" sized buffer, because the "irq_work_run()" could kick
> in and disable the event, leaving no trace generated (because we
> were tracing only userspace). So, we now have a 0 sized record
> with the event in disabled state, which the perf tool ignores.

Got it.

>
>>
>> I guess it might be better to separate out the problems with using
>> PERF_AUX_FLAG_TRUNCATED and related aspects, in a subsequent patch
>> which just updates the code with the above TODO comment ?
>
> The problems with the driver using TRUNCATED flag must be addressed
> in a different patch. This patch is only to comply with the behavior,
> of "TRUNCATED" indicates the event is disabled. So do not start a
> session.
>
> Does that help ?

Yes but I was just suggesting about listing out the problems with current
PERF_AUX_FLAG_TRUNCATED and possible usage of PERF_AUX_FLAG_PARTIAL instead,
along with the comments (as proposed here in this patch) in a separate patch
without any code change. This might help keep this patch's objective clear.