Re: [PATCH V6 0/5] PCIe TPH and cache direct injection support

From: Wei Huang
Date: Wed Oct 02 2024 - 13:19:25 EST




On 9/30/24 11:55, Bjorn Helgaas wrote:
> On Fri, Sep 27, 2024 at 04:56:48PM -0500, Wei Huang wrote:
>> Hi All,
>>
>> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
>> devices to provide optimization hints for requests that target memory
>> space. These hints, in a format called steering tag (ST), are provided
>> in the requester's TLP headers and allow the system hardware, including
>> the Root Complex, to optimize the utilization of platform resources
>> for the requests.
>>
>> Upcoming AMD hardware implement a new Cache Injection feature that
>> leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
>> Coherent DMA writes directly into an L2 within the CCX (core complex)
>> closest to the CPU core that will consume it. This technology is aimed
>> at applications requiring high performance and low latency, such as
>> networking and storage applications.
>>
>> This series introduces generic TPH support in Linux, allowing STs to be
>> retrieved and used by PCIe endpoint drivers as needed. As a
>> demonstration, it includes an example usage in the Broadcom BNXT driver.
>> When running on Broadcom NICs with the appropriate firmware, it shows
>> substantial memory bandwidth savings and better network bandwidth using
>> real-world benchmarks. This solution is vendor-neutral and implemented
>> based on industry standards (PCIe Spec and PCI FW Spec).
>>
>> V5->V6:
>> * Rebase on top of pci/main (tag: pci-v6.12-changes)
>> * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
>> * Move tph.c to drivers/pci directory (Lukas)
>> * Remove CONFIG_ACPI dependency (Lukas)
>> * Slightly re-arrange save/restore sequence (Lukas)
>
> Thanks, I'll wait for the kernel test robot warnings to be resolved.

Fixed in V7. I tested with clang 18.1.8 and compilation with W=1 passed.

>
> In patch 2/5, reword commit logs as imperative mood, e.g.,
> s/X() is added/Add X()/, as you've already done for 1/5 and 3/5.

Fixed

>
> Maybe specify the ACPI _DSM name? This would help users know whether
> their system can use this, or help them request that a vendor
> implement the _DSM.

I added more information (rev, func, name) in the code and in the commit
message. End-users should be able to use this info to locate this _DSM
method in ACPI DSL.

>
> In patch 4/5, s/sustancial/substantial/. I guess the firmware you
> refer to here means the system firmware that would provide the _DSM
> required for this to work, i.e., not firmware on the NIC itself?
> Would be helpful for users to have a hint as to how to tell whether to
> expect a benefit on their system.

We need both NIC FW and system FW to support TPH. I revised the commit
message to clarify.

I cannot speak for Broadcom on which NIC FW version will support it. I
know that, upon release, this feature will be backward compatible and
carried on in the future.

>
> The 5/5 commit log could say what the patch *does*, not what *could*
> be done (the subject does say what the patch does, but it's nice if
> it's in the commit log as well so it's complete by itself). Also, a
> hint that using the steering tag helps direct DMA writes to a cache
> close to the CPU expected to consume it might be helpful to motivate
> the patch.

Fixed. Thanks you for the detailed feedback!

>
> Bjorn