Re: [PATCH] PCI/ACPI: Decouple the negotiation of ASPM and other PCIe services

From: Yicong Yang
Date: Mon Apr 11 2022 - 05:30:48 EST


Hi Bjorn, Rafael,

On 2022/4/8 0:41, Rafael J. Wysocki wrote:
> On Thu, Apr 7, 2022 at 5:43 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>>
>> [+cc Rafael]
>>
>> On Thu, Apr 07, 2022 at 09:16:02PM +0800, Yicong Yang wrote:
>>> Currently we regard ASPM as a necessary PCIe service and if it's disabled
>>> by pcie_aspm=off we cannot enable other services like AER and hotplug.
>>> However the ASPM is just one of the PCIe services and other services
>>> mentioned no dependency on this. So this patch decouples the negotiation
>>> of ASPM and other PCIe services, then we can make use of other services
>>> in the absence of ASPM.
>>
>> Why do you want to boot with "pcie_aspm=off"? If we have to use a
>> PCI-related parameter to boot, something is already wrong, so if
>> there's a problem that requires ASPM to be disabled, we should fix
>> that first.
>>

We found this when testing the functions of AER and hotplug. The pcie_aspm=off
is added by the tester who found it affect the function and it makes us think
that it maybe not reasonable to couple these 3 services together.

>> If there's a known hardware or firmware issue with ASPM, we should
>> quirk it so users don't have to discover this parameter.
>>
>>> Aaron Sierra tried to fix this originally:
>>> https://lore.kernel.org/linux-pci/20190702201318.GC128603@xxxxxxxxxx/
>>
>> Yes. My question from that review is still open:
>>
>> But Rafael added ACPI_PCIE_REQ_SUPPORT with 415e12b23792 ("PCI/ACPI:
>> Request _OSC control once for each root bridge (v3)") [1], apparently
>> related to a bug [2]. I assume there was some reason for requiring
>> all those things together, so I'd really like his comments.
>
> Well, it was quite a few years ago.
>
>> [1] https://git.kernel.org/linus/415e12b23792
>> [2] https://bugzilla.kernel.org/show_bug.cgi?id=20232
>>
>> Rafael clearly said in [1] that we need to:
>>
>> ... check if all of the requisite _OSC support bits are set before
>> calling acpi_pci_osc_control_set() for a given root complex.
>
> IIRC, the idea was to avoid requesting native control of anything PCIe
> if those bits were not set in the mask, because otherwise we wouldn't
> be able to get PME and native hotplug control which were not
> configurable at that time. [PME is still not configurable and
> potentially related to hotplug, because they may use the same MSI IRQ
> in principle, but the native hotplug is configurable now anyway.]
>

I'm a bit confused about the 'configurable' here, is it only about PME
and hotplug share the same interrupt? Currently the PME and hotplug
interrupt is allocated by the pcie port driver and PME can get the irq
if the interrupt is allocated successfully.

>> We would still need to explain why Rafael thought all these _OSC
>> support bits were required, but now they're not.
>>
>> _OSC does not negotiate directly for control of ASPM (though of course
>> it *does* negotiate for control of the PCIe Capability, which contains
>> the ASPM control bits), but the PCI Firmware spec, r3.3, sec 4.5.3, has
>> this comment in a sample _OSC implementation:
>>
>> // Only allow native hot plug control if the OS supports:
>> // * ASPM
>> // * Clock PM
>> // * MSI/MSI-X
>>
>> which matches the current ACPI_PCIE_REQ_SUPPORT.
>>

thanks for the reference. So it indicates that native hotplug depends
on ASPM? But maybe not AER or PME, as commented following above sample
in the spec:

// Always allow native PME, AER (no dependencies)

So the AER should work on pcie_aspm=off, does it make sense?

>> So I think I would approach this by reworking the _OSC negotiation so
>> we always advertise "OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT"
>> if CONFIG_PCIEASPM=y.
>
> That'd be reasonable IMO.
>
>> Advertising support for ASPM doesn't mean Linux has to actually
>> *enable* it, so we could make a different mechanism to prevent use of
>> ASPM if we have a device or platform quirk or we're booting with
>> "pcie_aspm=off".
>

I agree on this and I think this approach can resolve the condition here.
If os got the ASPM control but user sepcified pcie_aspm=off, I think we
can stop the ASPM link configuring to avoid enable the ASPM function.

> Right.
>
> Note that if we don't request the native control of a PCIe feature,
> this basically gives the BIOS a licence to scribble on the related
> device registers and some of the features are not independent, so we
> may need to advertise support for two features in order to get control
> of just one of them.
> .
>

ok. thanks for the note.

Regards,
Yicong