Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH

From: Wei Huang
Date: Mon Jul 29 2024 - 10:56:39 EST




On 7/25/24 16:29, Bjorn Helgaas wrote:
> On Wed, Jul 24, 2024 at 03:05:59PM -0500, Wei Huang wrote:
>>
>>
>> On 7/23/24 17:41, Bjorn Helgaas wrote:
>>> On Wed, Jul 17, 2024 at 03:55:04PM -0500, Wei Huang wrote:
>>>> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
>>>> can potentially cause issues when the system hardware consumes the tags.
>>>
>>> Hmm. What kind of issues? Crash? Data corruption? Poor
>>> performance?
>>
>> Not crash or functionality errors. Usually it is QoS related because of
>> resource competition. AMD has
>
> Looks like you had more to say here?

I hit the send button too fast. What I wanted to say was that there will
be AMD QoS patches to control TPH. Note that they will be hooked up under
x86/resctrl. Since they are AMD-specific, they will be independent of the
PCIe subsystem code.

>
> I *assume* that both the PH hint and the Steering Tags are only
> *hints* and there's no excuse for hardware to corrupt anything (e.g.,
> by omitting cache maintenance) even if the hint turns out to be wrong.
> If that's the case, I assume "can potentially cause issues" really
> just means "might lead to lower performance". That's what I want to
> clarify and confirm.

Correct, only QoS-related concerns. There won't be any correctness
concerns.

>
>>>> Provide a kernel option, with related helper functions, to completely
>>>> prevent TPH from being enabled.
>>>
>>> Also would be nice to have a hint about the difference between "notph"
>>> and "nostmode". Maybe that goes in the "nostmode" patch? I'm not
>>> super clear on all the differences here.
>>
>> I can combine them. Here are the combinations and their meanings based
>> on TPH Control Register values:
>>
>> Requestor Enable | ST Mode | Meaning
>> -----------------+---------+----------------------------------------
>> 00               | xx      | TPH disabled (i.e. notph)
>> 01               | 00      | TPH enabled, NO ST Mode (i.e. nostmode)
>> 01 or 11         | 01      | Interrupt Vector mode
>> 01 or 11         | 10      | Device specific mode
>>
>> If you have any other thoughts on how to approach these modes, please
>> let me know.
>
> IIRC, there's no interface in this series that really does anything
> with TPH per se; drivers would only use the ST-related things.
>
> If that's the case, maybe "pci=notph" isn't needed yet.

I can go with that. There will be a BIOS option to turn TPH off on AMD
platforms, and I would expect similar options on other vendors'
platforms, so I am not overly concerned about dropping pci=notph.

>
> Bjorn
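
For reference, the Requester Enable / ST Mode combinations in the table
quoted above can be decoded roughly as below. This is only a user-space
sketch, not code from this series: the macro and function names are
hypothetical, and the bit positions (ST Mode Select in bits 2:0,
Requester Enable in bits 9:8 of the TPH Requester Control Register) are
my reading of the PCIe spec and should be checked against it.

```c
#include <stdint.h>

/*
 * Hypothetical names for illustration only. Bit positions are an
 * assumption based on the TPH Requester Control Register layout.
 */
#define TPH_CTRL_ST_MODE_MASK	0x0007u	/* ST Mode Select, bits 2:0 */
#define TPH_CTRL_REQ_EN_MASK	0x0300u	/* Requester Enable, bits 9:8 */
#define TPH_CTRL_REQ_EN_SHIFT	8

/* Map a raw control register value to the meaning given in the table. */
static const char *tph_ctrl_describe(uint32_t ctrl)
{
	uint32_t req_en = (ctrl & TPH_CTRL_REQ_EN_MASK) >> TPH_CTRL_REQ_EN_SHIFT;
	uint32_t st_mode = ctrl & TPH_CTRL_ST_MODE_MASK;

	/* Requester Enable == 00: TPH is off, ST Mode is a don't-care. */
	if (req_en == 0)
		return "TPH disabled";

	/* Requester Enable == 10 is not listed in the table. */
	if (req_en == 2)
		return "Requester Enable value not listed in the table";

	/* Requester Enable == 01 or 11: the ST Mode field selects the mode. */
	switch (st_mode) {
	case 0:
		return "TPH enabled, No ST Mode";
	case 1:
		return "Interrupt Vector mode";
	case 2:
		return "Device Specific mode";
	default:
		return "ST Mode value not listed in the table";
	}
}
```

With this, tph_ctrl_describe(0x0100) corresponds to the "nostmode" row
and any value with bits 9:8 clear corresponds to the "notph" row.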