Re: [PATCH v9 3/4] drivers/perf: add DesignWare PCIe PMU driver
From: Shuai Xue
Date: Mon Oct 23 2023 - 05:39:45 EST
On 2023/10/23 10:05, Baolin Wang wrote:
>
>
> On 10/22/2023 3:47 PM, Shuai Xue wrote:
>> Hi, Baolin,
>>
>> I droped your Revivewed-by tag due to that I made significant changes to this
>> patch previously, please explicty give me Revivewed-by tag again if you are
>> happy with the changes.
>
> Yes, I am happy with this version (just some nits as below), and thanks for the review from other guys. Please feel free to add:
>
> Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Thank you.
Best Regards,
Shuai
>
>> On 2023/10/20 21:42, Shuai Xue wrote:
>>> This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
>>> for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express
>>> Core controller IP which provides statistics feature. The PMU is a PCIe
>>> configuration space register block provided by each PCIe Root Port in a
>>> Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
>>> injection, and Statistics).
>>>
>>> To facilitate collection of statistics the controller provides the
>>> following two features for each Root Port:
>>>
>>> - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
>>> time spent in each low-power LTSSM state) and
>>> - one 32-bit counter for Event Counting (error and non-error events for
>>> a specified lane)
>>>
>>> Note: There is no interrupt for counter overflow.
>>>
>>> This driver adds PMU devices for each PCIe Root Port. And the PMU device is
>>> named based the BDF of Root Port. For example,
>>>
>>> 30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
>>>
>>> the PMU device name for this Root Port is dwc_rootport_3018.
>>>
>>> Example usage of counting PCIe RX TLP data payload (Units of bytes)::
>>>
>>> $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>>
>>> average RX bandwidth can be calculated like this:
>>>
>>> PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
>>>
>>> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
>>> ---
>
> [snip]
>
>>> +static u64 dwc_pcie_pmu_read_time_based_counter(struct perf_event *event)
>>> +{
>>> + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
>>> + struct pci_dev *pdev = pcie_pmu->pdev;
>>> + int event_id = DWC_PCIE_EVENT_ID(event);
>>> + u16 ras_des_offset = pcie_pmu->ras_des_offset;
>>> + u32 lo, hi, ss;
>>> +
>>> + /*
>>> + * The 64-bit value of the data counter is spread across two
>>> + * registers that are not synchronized. In order to read them
>>> + * atomically, ensure that the high 32 bits match before and after
>>> + * reading the low 32 bits.
>>> + */
>>> + pci_read_config_dword(pdev, ras_des_offset +
>>> + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, &hi);
>>> + do {
>>> + /* snapshot the high 32 bits */
>>> + ss = hi;
>>> +
>>> + pci_read_config_dword(
>>> + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW,
>>> + &lo);
>>> + pci_read_config_dword(
>>> + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH,
>>> + &hi);
>>> + } while (hi != ss);
>>> +
>>> + /*
>>> + * The Group#1 event measures the amount of data processed in 16-byte
>>> + * units. Simplify the end-user interface by multiplying the counter
>>> + * at the point of read.
>>> + */
>>> + if (event_id >= 0x20 && event_id <= 0x23)
>>> + return (((u64)hi << 32) | lo) << 4;
>>> + else
>
> You can drop the 'else'.
Agreed, will fix it in next version.
>>> +
>>> + event->cpu = pcie_pmu->on_cpu;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void dwc_pcie_pmu_set_period(struct hw_perf_event *hwc)
>>> +{
>>> + local64_set(&hwc->prev_count, 0);
>>> +}
>
> Only dwc_pcie_pmu_event_start() will call this small function, why just remove this function and move local64_set() into dwc_pcie_pmu_event_start()?
Good suggestion, will fix it.