Re: [PATCH v1 1/3] docs: perf: Add description for Synopsys DesignWare PCIe PMU driver

From: Shuai Xue
Date: Fri Sep 23 2022 - 09:51:52 EST




在 2022/9/22 PM9:25, Will Deacon 写道:
> On Sat, Sep 17, 2022 at 08:10:34PM +0800, Shuai Xue wrote:
>> Alibaba's T-Head Yitan 710 SoC is built on Synopsys' widely deployed and
>> silicon-proven DesignWare Core PCIe controller which implements PMU for
>> performance and functional debugging to facilitate system maintenance.
>> Document it to provide guidance on how to use it.
>>
>> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
>> ---
>> .../admin-guide/perf/dwc_pcie_pmu.rst | 61 +++++++++++++++++++
>> Documentation/admin-guide/perf/index.rst | 1 +
>> 2 files changed, 62 insertions(+)
>> create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst
>>
>> diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
>> new file mode 100644
>> index 000000000000..fbcbf10b23b7
>> --- /dev/null
>> +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst
>> @@ -0,0 +1,61 @@
>> +======================================================================
>> +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
>> +======================================================================
>> +
>> +DesignWare Cores (DWC) PCIe PMU
>> +===============================
>> +
>> +To facilitate collection of statistics, Synopsys DesignWare Cores PCIe
>> +controller provides the following two features:
>> +
>> +- Time Based Analysis (RX/TX data throughput and time spent in each
>> + low-power LTSSM state)
>> +- Lane Event counters (Error and Non-Error for lanes)
>> +
>> +The PMU is not a PCIe Root Complex integrated End Point (RCiEP) device but
>> +only register counters provided by each PCIe Root Port.
>> +
>> +Time Based Analysis
>> +-------------------
>> +
>> +Using this feature you can obtain information regarding RX/TX data
>> +throughput and time spent in each low-power LTSSM state by the controller.
>> +
>> +The counters are 64-bit width and measure data in two categories,
>> +
>> +- percentage of time does the controller stay in LTSSM state in a
>> + configurable duration. The measurement range of each Event in Group#0.
>> +- amount of data processed (Units of 16 bytes). The measurement range of
>> + each Event in Group#1.
>> +
>> +Lane Event counters
>> +-------------------
>> +
>> +Using this feature you can obtain Error and Non-Error information in
>> +specific lane by the controller.
>> +
>> +The counters are 32-bit width and the measured event is select by:
>> +
>> +- Group i
>> +- Event j within the Group i
>> +- and Lank k
>> +
>> +Some of the event counters only exist for specific configurations.
>> +
>> +DesignWare Cores (DWC) PCIe PMU Driver
>> +=======================================
>> +
>> +This driver add PMU devices for each PCIe Root Port. And the PMU device is
>> +named based the BDF of Root Port. For example,
>> +
>> + 10:00.0 PCI bridge: Device 1ded:8000 (rev 01)
>> +
>> +the PMU device name for this Root Port is pcie_bdf_100000.
>> +
>> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
>> +
>> + $# perf stat -a -e pcie_bdf_200/Rx_PCIe_TLP_Data_Payload/
>
> Do you really need to expose a separate PMU instance to userspace for each
> BDF? I think it would be much cleaner if you could follow the approach used
> by hisilicon/hisi_pcie_pmu.c and hide these details in the driver, exposing
> a `bdf=' selector to userspace instead.

Thank you for your valuable comments.

It's a good idea to encode bdf in bitmap and exposing a `bdf=' selector to userspace.
The problem of bdf selector is that the user need to compute bdf from lanes, do you
think it is user friendly? I'm worried about increasing the burden of users.

Best Regards
Shuai