Re: [PATCH RESEND 0/4] Add support for HiSilicon PCIe Tune and Trace device

From: Yicong Yang
Date: Thu Apr 22 2021 - 08:54:59 EST


On 2021/4/22 11:49, Leo Yan wrote:
> On Mon, Apr 19, 2021 at 09:03:18PM +0800, Yicong Yang wrote:
>> On 2021/4/17 21:56, Alexander Shishkin wrote:
>>> Yicong Yang <yangyicong@xxxxxxxxxxxxx> writes:
>>>
>>>> The reason for not using perf is because there is no current support
>>>> for uncore tracing in the perf facilities.
>>>
>>> Not unless you count
>>>
>>> $ perf list|grep -ic uncore
>>> 77
>>>
>>
>> These uncore events probably do not support sampling.
>>
>> I tried on x86:
>>
>> # ./perf record -e uncore_imc_0/cas_count_read/
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_imc_0/cas_count_read/).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> For HiSilicon uncore PMUs, we don't support uncore sampling:
>>
>> 'The current driver does not support sampling. So "perf record" is unsupported. ' [1]
>>
>> and also in another PMU:
>>
>> 'PMU doesn't support process specific events and cannot be used in sampling mode.' [2]
>>
>> [1] Documentation/admin-guide/perf/hisi-pmu.rst
>> [2] Documentation/admin-guide/perf/arm_dsu_pmu.rst
>
> I did some debugging on this, and yes, the failure is because these
> x86 uncore events don't support sampling.
>
> So I can use the commands below for the uncore event
> 'uncore_imc/data_reads/' in my experiment:
>
> # perf record -e 'uncore_imc/data_reads/' --no-samples -- ls
> # perf stat -e 'uncore_imc/data_reads/' -- ls
>
> For your case, I think you need to implement the pmu::event_init()
> callback; it should not reject tracing events even when sampling is
> set, just like the other perf event drivers that support AUX tracing.
>

Thanks for the hint! I don't know much about perf, so I only did a
basic test. I will investigate this further.

>>>> We have our own format
>>>> of data and don't need perf doing the parsing.
>>>
>>> Perf has AUX buffers, which are used for all kinds of own formats.
>>>
>>
>> OK. We thought perf would break the data format, but it seems AUX buffers won't.
>> Do we need to add full support for tracing as well as parsing, or is it OK
>> not to parse it through perf?
>
> IMHO, this could be divided into two parts. The first part is to enable
> a perf driver with AUX tracing support, so that the perf tool can
> capture the trace data. The second part is to add a decoder to the perf
> tool so that developers can *consume* the trace data; for the decoder,
> you could refer to the code in:
>
> tools/perf/util/intel-pt-decoder/
> tools/perf/util/cs-etm-decoder/
>
> Or the Arm SPE case:
>
> tools/perf/util/arm-spe-decoder/
>

I will refer to these implementations to see how to add a decoder for
our trace data. Very detailed guidance!

>>>> A similar approach for implementing this function is ETM, which uses
>>>> sysfs for configuration and a character device for dumping data.
>>>
>>> And also perf. One reason ETM has a sysfs interface is because the
>>> driver predates perf's AUX buffers. Can't say if it's the only
>>> reason. I'm assuming you're talking about Coresight ETM.
>
> I am not the best person to give the background on this. Mathieu or
> Mike could give more information. From my understanding, sysfs nodes
> can be used as knobs for configuration, but they are difficult to use
> for profiling.
>

As the maintainers explained, there are historical reasons for ETM
having sysfs interfaces: there were no perf AUX buffers at the
beginning. I considered the sysfs interface as an option, but the perf
AUX buffer is better, as suggested.

> Let's think about profiling: if one developer uses sysfs for the
> settings and to read out the trace data, these pieces of information
> are separate. If another developer wants to review the profiling
> result, all of this information needs to be shared together.
>

OK, that makes sense to me.

> So we benefit a lot from using the perf tool, since all the
> profiling context is gathered (DSOs, and the hardware configuration,
> which can be saved into metadata), so the final profiling file can be
> easily shared and is friendlier to review.
>

OK. It will be beneficial if we use perf for both tracing and decoding,
as we'll also get additional information attached to the trace data.

We have two functions: tracing and tuning. For tracing we can make use
of the perf AUX buffer, but for tuning I still cannot see how to make
use of perf. So probably we can make tuning go through sysfs? Daniel
suggested this as well.

I appreciate the suggestions and guidance!

Regards,
Yicong

> Thanks,
> Leo
>