Re: [PATCH v2 2/3] PCI: brcmstb: CLKREQ# accomodations of downstream device

From: Florian Fainelli
Date: Fri Apr 14 2023 - 16:33:38 EST


On 4/14/23 13:27, Bjorn Helgaas wrote:
This subject line no verb. Can you add a leading verb to suggest what
this patch does?

s/accomodations/accommodations/

On Tue, Apr 11, 2023 at 12:59:17PM -0400, Jim Quinlan wrote:
The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
deliberately set by the probe() into one of three mutually exclusive modes:

(a) No CLKREQ# expected or required, refclk is always available.
(b) CLKREQ# is expected to be driven by downstream device when needed.
(c) Bidirectional CLKREQ# for L1SS capable devices.

Previously, only (b) was supported by the driver, as almost all STB/CM
boards operate in this mode. But now there is interest in activating L1SS
power savings from STB/CM customers, and also interest in accomodating mode
(a) for designs such as the RPi CM4 with IO board.

accommodating

The HW+driver is able to tell us when mode (a) mode is needed. But there
is no easy way to tell if L1SS mode should be configured. In certain
situations, getting this wrong may cause a panic during boot time. So we
rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired.
Using this mode only makes sense when the downstream device is L1SS-capable
and the OS has been configured to activate L1SS
(e.g. policy==powersupersave).

I'm really concerned about the user experience here. I assume users
do not want to edit the DT based on what device they plug in. They
shouldn't need to (and probably won't) know whether the device
supports L1SS.

I hate kernel/module parameters, but I think even that would be better
then having to edit the DT.

The problem I see with kernel/module parameters is that it is really hard to differentiate whether they should be applied across all instances of the device/drivers or just for one in particular.

The Raspberry Pi 4 is a single pcie-brcmstb instance, but we have other systems supported by that driver on Set-top-box and Cable Modem chips for instance where we may have different types of end-points being connected, some of which could be Multi-chip-module (MCM) where everything is known ahead of time, and sometimes cards that are plugged to full-sized PCIe or mini-PCIe connectors, where some amount of runtime discoverability is involved.

Without inventing some custom modular parameter syntax, it may not work that well. The Device Tree properties at least easily allow for per-instance configuration.


There's obviously a period of time when L1SS is supported but not yet
enabled, so I'm *guessing* the "OS has been configured to activate
L1SS" is not actually a requirement, and choosing (c) really just
opens the possibility that L1SS can be used?

Would be nice to have a hint (maybe a line or two of the panic
message) to help users find the fix for a problem they're seeing.

Obviously the ideal would be if we could use (c) in all cases, so I
assume that's where a panic might happen. What situation would that
be? An endpoint that doesn't support L1SS? One that supports L1SS
but it's not enabled? Maybe if L1SS isn't configured correctly, e.g.,
LTR values programmed wrong?

Bjorn

--
Florian