Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout

From: AngeloGioacchino Del Regno

Date: Thu Nov 06 2025 - 04:00:18 EST


Il 06/11/25 06:52, Chen-Yu Tsai ha scritto:
On Wed, Nov 5, 2025 at 7:32 PM AngeloGioacchino Del Regno
<angelogioacchino.delregno@xxxxxxxxxxxxx> wrote:

Il 05/11/25 10:21, Chen-Yu Tsai ha scritto:
On Wed, Nov 5, 2025 at 4:45 PM AngeloGioacchino Del Regno
<angelogioacchino.delregno@xxxxxxxxxxxxx> wrote:

Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
common code") come up later" in the code, it is possible for link up to
occur later:

Let's standardize this to succeed as there are usecases where devices
(and the link) appear later even without hotplug. For example, a
reconfigured FPGA device.

Another case for this is the new PCIe power control stuff. The power
control mechanism only gets triggered in the PCI core after the driver
calls into pci_host_probe(). The power control framework then triggers
a bus rescan. In most driver implementations, this sequence happens
after link training. If the driver errors out when link training times
out, it will never get to the point where the device gets turned on.

Ignore the link up timeout, and lower the error message down to a
warning.

This makes PCIe devices that have not-always-on power rails work.
However there may be some reversal of PCIe power sequencing, since now
the PERST# and clocks are enabled in the driver, while the power is
applied afterwards.

Signed-off-by: Chen-Yu Tsai <wenst@xxxxxxxxxxxx>

Ok, that's sensible.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>

---
The change works to get my PCIe WiFi device working, but I wonder if
the driver should expose more fine grained controls for the link clock
and PERST# (when it is owned by the controller and not just a GPIO) to
the power control framework. This applies not just to this driver.

The PCI standard says that PERST# should hold the device in reset until
the power rails are valid or stable, i.e. at their designated voltages.

I completely agree with all of the above - and I can imagine multiple PCI-Express
controller drivers doing the same as what's being done in MTK Gen3.

This means that the boot process may get slowed down by the port startup sequence
on multiple PCI-Express controllers (again not just MediaTek) and it's something
that must be resolved in some way... with the fastest course of action imo being
giving controller drivers knowledge of whether there's any device that is expected
to be powered off at that time (in order to at least avoid all those waits that
are expected to fail).

That also requires some refactoring, since all the drivers _wait_ for link
up before going into the PCI core, which does the actual child node parsing.

I would like some input from Bartosz, who introduced the PCI power control
framework, and Manivannan, who added slot power control.

P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
Could you please check and eventually send a commit to do the same there?

My quick survey last week indicated that all the drivers except for the
dwc family error out if link up timed out.

I don't have any hardware for the older generation though. And it looks
like for the previous gen, the driver performs even worse, since it can
support multiple slots, and each slot is brought up sequentially. A slot
is discarded if link up times out. And the whole driver errors out if no
slots are working.


Hey, that's bold.

If only one driver (DWC) is working okay, there's something wrong that must be
fixed before that behavior change goes upstream (which it already did, ugh).

To be fair one only runs into it if they convert over to the PCI slot power
description in the device tree, and their hardware isn't DWC based. This
is pretty new.


That changes a lot of things then - I thought it was a regression, but it's not.

This needs attention from both Bartosz and Mani really-right-now.

I'm not sure about possible good solutions, and unfortunately I don't really have
any time to explore, so I'm not spitting any words on that - leaving this to both
Bartosz and Mani as that's also the right thing to do anyway.

Mani mentioned [1] that work towards moving the pwrctrl stuff into drivers
is almost complete. So I think we're covered.


We're covered. Yes.

Cheers,
Angelo

ChenYu

[1] https://lore.kernel.org/all/rz6ajnl7l25hfl2u7lloywtw7sq7smhb63hg76wjslyuwyjb7a@fhuafuino5kv/