Re: [PATCH] mwifiex: Add quirk resetting the PCI bridge on MS Surface devices

From: Bjorn Helgaas
Date: Mon Oct 11 2021 - 12:53:07 EST


[+cc Alex]

On Mon, Oct 11, 2021 at 03:42:38PM +0200, Jonas Dreßler wrote:
> The most recent firmware (15.68.19.p21) of the 88W8897 PCIe+USB card
> reports a hardcoded LTR value to the system during initialization,
> probably as an (unsuccessful) attempt of the developers to fix firmware
> crashes. This LTR value prevents most of the Microsoft Surface devices
> from entering deep powersaving states (either platform C-State 10 or
> S0ix state), because the exit latency of that state would be higher than
> what the card can tolerate.

S0ix and C-State 10 are ACPI concepts that don't mean anything in a
PCIe context.

I think LTR is only involved in deciding whether to enter the ASPM
L1.2 substate. Maybe the system will only enter C-State 10 or S0ix
when the link is in L1.2?

> Turns out the card works just the same (including the firmware crashes)
> no matter if that hardcoded LTR value is reported or not, so it's kind
> of useless and only prevents us from saving power.
>
> To get rid of those hardcoded LTR requirements, it's possible to reset
> the PCI bridge device after initializing the card's firmware. I'm not
> exactly sure why that works, maybe the power management subsystem of the
> PCH resets its stored LTR values when doing a function level reset of
> the bridge device. Doing the reset once after starting the wifi firmware
> works very well, probably because the firmware only reports that LTR
> value a single time during firmware startup.
>
> Signed-off-by: Jonas Dreßler <verdre@xxxxxxx>
> ---
> drivers/net/wireless/marvell/mwifiex/pcie.c | 12 +++++++++
> .../wireless/marvell/mwifiex/pcie_quirks.c | 26 +++++++++++++------
> .../wireless/marvell/mwifiex/pcie_quirks.h | 1 +
> 3 files changed, 31 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index c6ccce426b49..2506e7e49f0c 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -1748,9 +1748,21 @@ mwifiex_pcie_send_boot_cmd(struct mwifiex_adapter *adapter, struct sk_buff *skb)
> static int mwifiex_pcie_init_fw_port(struct mwifiex_adapter *adapter)
> {
> struct pcie_service_card *card = adapter->card;
> + struct pci_dev *pdev = card->dev;
> + struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
> int tx_wrap = card->txbd_wrptr & reg->tx_wrap_mask;
>
> + /* Trigger a function level reset of the PCI bridge device, this makes
> + * the firmware (latest version 15.68.19.p21) of the 88W8897 PCIe+USB
> + * card stop reporting a fixed LTR value that prevents the system from
> + * entering package C10 and S0ix powersaving states.

I don't believe this. Why would resetting the root port change what
the downstream device reports via LTR messages?

From PCIe r5.0, sec 5.5.1:

The following rules define how the L1.1 and L1.2 substates are entered:
...
* When in ASPM L1.0 and the ASPM L1.2 Enable bit is Set, the L1.2
substate must be entered when CLKREQ# is deasserted and all of
the following conditions are true:

- The reported snooped LTR value last sent or received by this
Port is greater than or equal to the value set by the
LTR_L1.2_THRESHOLD Value and Scale fields, or there is no
snoop service latency requirement.

- The reported non-snooped LTR last sent or received by this
Port value is greater than or equal to the value set by the
LTR_L1.2_THRESHOLD Value and Scale fields, or there is no
non-snoop service latency requirement.

From the LTR Message format in sec 6.18:

No-Snoop Latency and Snoop Latency: As shown in Figure 6-15, these
fields include a Requirement bit that indicates if the device has a
latency requirement for the given type of Request. If the
Requirement bit is Set, the LatencyValue and LatencyScale fields
describe the latency requirement. If the Requirement bit is Clear,
there is no latency requirement and the LatencyValue and
LatencyScale fields are ignored.
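
To make the two excerpts above concrete, here is a minimal userspace C
sketch (not kernel code) of the comparison a Port would make. The
function and macro names are made up for this example. The bit layout
(Requirement in bit 15, LatencyScale in bits 12:10, LatencyValue in
bits 9:0, scale n meaning units of 2^(5n) ns) follows the LTR message
format as best understood here, and treating a reserved scale encoding
as 0 ns is an arbitrary choice of the sketch, not something the spec
says.

```c
#include <stdbool.h>
#include <stdint.h>

/* One 16-bit latency field of an LTR message (Snoop or No-Snoop). */
#define LTR_REQUIREMENT		0x8000u	/* bit 15: Requirement */
#define LTR_SCALE_SHIFT		10	/* bits 12:10: LatencyScale */
#define LTR_SCALE_MASK		0x7u
#define LTR_VALUE_MASK		0x03ffu	/* bits 9:0: LatencyValue */

/*
 * Convert a value/scale pair to nanoseconds.  Scale n means units of
 * 2^(5n) ns (1 ns, 32 ns, 1024 ns, ...).  Encodings above 5 are
 * reserved; returning 0 for them is an assumption of this sketch.
 */
static uint64_t ltr_to_ns(uint32_t value, uint32_t scale)
{
	if (scale > 5)
		return 0;
	return (uint64_t)value << (5 * scale);
}

/* True if this single latency field does not block L1.2 entry. */
static bool ltr_field_allows_l12(uint16_t field, uint64_t threshold_ns)
{
	uint32_t value = field & LTR_VALUE_MASK;
	uint32_t scale = (field >> LTR_SCALE_SHIFT) & LTR_SCALE_MASK;

	/* Requirement bit Clear: no latency requirement at all. */
	if (!(field & LTR_REQUIREMENT))
		return true;

	return ltr_to_ns(value, scale) >= threshold_ns;
}

/*
 * Both the snooped and the non-snooped condition must hold.  th_value
 * and th_scale stand for the LTR_L1.2_THRESHOLD Value and Scale fields.
 */
static bool ltr_allows_l12(uint16_t snoop, uint16_t nosnoop,
			   uint32_t th_value, uint32_t th_scale)
{
	uint64_t threshold_ns = ltr_to_ns(th_value, th_scale);

	return ltr_field_allows_l12(snoop, threshold_ns) &&
	       ltr_field_allows_l12(nosnoop, threshold_ns);
}
```

For example, a Snoop field of 0x8803 (Requirement set, scale 2, value
3) decodes to 3 * 1024 = 3072 ns. With a threshold of value 80, scale
2 (81920 ns) that blocks L1.2, while the same field with the
Requirement bit clear (0x0803) does not.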

Resetting the root port might make it forget the LTR value it last
received. If that's equivalent to having no service latency
requirement, it *might* enable L1.2 entry, although that doesn't seem
equivalent to the downstream device having sent an LTR message with
the Requirement bit cleared.

I think the endpoint is required to send a new LTR message before it
goes to a non-D0 state (sec 6.18), so the bridge will capture the
latency again, and we'll probably be back in the same state.

This all seems fragile to me. If we force the link to L1.2 without
knowing accurate exit latencies and latency tolerance, the device is
liable to drop packets.

> + * We need to do it here because it must happen after firmware
> + * initialization and this function is called right after that is done.
> + */
> + if (card->quirks & QUIRK_DO_FLR_ON_BRIDGE)
> + pci_reset_function(parent_pdev);

PCIe r5.0, sec 7.5.3.3, says Function Level Reset can only be
supported by endpoints, so I guess this will actually do some other
kind of reset.
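
One way to check what the bridge advertises is to look for the
Function Level Reset Capability bit (bit 28 of Device Capabilities) in
its config space. Below is a minimal userspace C sketch that does this
on a raw config-space dump, e.g. the first 256 bytes read from
/sys/bus/pci/devices/<BDF>/config. The helper names are hypothetical;
the register offsets and bit values are the ones used in the kernel's
pci_regs.h as far as I know. If the bit is clear, pci_reset_function()
presumably has to fall back to another of its reset methods, such as a
secondary bus reset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS		0x06
#define PCI_STATUS_CAP_LIST	0x10	/* capability list present */
#define PCI_CAPABILITY_LIST	0x34	/* offset of first capability */
#define PCI_CAP_ID_EXP		0x10	/* PCI Express capability ID */
#define PCI_EXP_DEVCAP		0x04	/* Device Capabilities */
#define PCI_EXP_DEVCAP_FLR	0x10000000u	/* bit 28: FLR Capability */

/* Config space is little-endian; assemble a dword byte by byte. */
static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 |
	       (uint32_t)cfg[off + 3] << 24;
}

/*
 * Walk the capability list in a config-space dump and report whether
 * the PCI Express capability advertises FLR.  Hypothetical helper.
 */
static bool cfg_advertises_flr(const uint8_t *cfg, size_t len)
{
	size_t pos;
	int ttl;

	if (len < 64 || !(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
		return false;

	pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;

	/* ttl bounds the walk in case the list is malformed or loops. */
	for (ttl = 48; ttl > 0 && pos >= 0x40 && pos + 8 <= len; ttl--) {
		if (cfg[pos] == PCI_CAP_ID_EXP)
			return (cfg_read32(cfg, pos + PCI_EXP_DEVCAP) &
				PCI_EXP_DEVCAP_FLR) != 0;
		pos = cfg[pos + 1] & 0xfc;
	}

	return false;
}
```

lspci -vv shows the same bit as "FLReset+" or "FLReset-" on the DevCap
line, which is probably the quicker check on a real system.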

> /* Write the RX ring read pointer in to reg->rx_rdptr */
> if (mwifiex_write_reg(adapter, reg->rx_rdptr, card->rxbd_rdptr |
> tx_wrap)) {
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
> index 0234cf3c2974..cbf0565353ae 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
> @@ -27,7 +27,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 4"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Pro 5",
> @@ -36,7 +37,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1796"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Pro 5 (LTE)",
> @@ -45,7 +47,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1807"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Pro 6",
> @@ -53,7 +56,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 6"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Book 1",
> @@ -61,7 +65,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Book"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Book 2",
> @@ -69,7 +74,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Book 2"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Laptop 1",
> @@ -77,7 +83,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Laptop"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {
> .ident = "Surface Laptop 2",
> @@ -85,7 +92,8 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
> DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
> DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Laptop 2"),
> },
> - .driver_data = (void *)QUIRK_FW_RST_D3COLD,
> + .driver_data = (void *)(QUIRK_FW_RST_D3COLD |
> + QUIRK_DO_FLR_ON_BRIDGE),
> },
> {}
> };
> @@ -103,6 +111,8 @@ void mwifiex_initialize_quirks(struct pcie_service_card *card)
> dev_info(&pdev->dev, "no quirks enabled\n");
> if (card->quirks & QUIRK_FW_RST_D3COLD)
> dev_info(&pdev->dev, "quirk reset_d3cold enabled\n");
> + if (card->quirks & QUIRK_DO_FLR_ON_BRIDGE)
> + dev_info(&pdev->dev, "quirk do_flr_on_bridge enabled\n");
> }
>
> static void mwifiex_pcie_set_power_d3cold(struct pci_dev *pdev)
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.h b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.h
> index 8ec4176d698f..f8d463f4269a 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.h
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.h
> @@ -18,6 +18,7 @@
> #include "pcie.h"
>
> #define QUIRK_FW_RST_D3COLD BIT(0)
> +#define QUIRK_DO_FLR_ON_BRIDGE BIT(1)
>
> void mwifiex_initialize_quirks(struct pcie_service_card *card);
> int mwifiex_pcie_reset_d3cold_quirk(struct pci_dev *pdev);
> --
> 2.31.1
>