Re: [PATCH net v3] tg3: Disable tg3 PCIe AER on system reboot

From: Simon Horman
Date: Fri Jan 31 2025 - 04:42:25 EST


On Thu, Jan 30, 2025 at 04:57:54PM -0500, Lenny Szubowicz wrote:
> Disable PCIe AER on the tg3 device on system reboot on a limited
> list of Dell PowerEdge systems. This prevents a fatal PCIe AER event
> on the tg3 device during the ACPI _PTS (prepare to sleep) method for
> S5 on those systems. The _PTS is invoked by acpi_enter_sleep_state_prep()
> as part of the kernel's reboot sequence as a result of commit
> 38f34dba806a ("PM: ACPI: reboot: Reinstate S5 for reboot").
>
> There was an earlier fix for this problem by commit 2ca1c94ce0b6
> ("tg3: Disable tg3 device on system reboot to avoid triggering AER").
> But it was discovered that this earlier fix caused a reboot hang
> when some Dell PowerEdge servers were booted via ipxe. To address
> this reboot hang, the earlier fix was essentially reverted by commit
> 9fc3bc764334 ("tg3: power down device only on SYSTEM_POWER_OFF").
> This re-exposed the tg3 PCIe AER on reboot problem.
>
> This fix is not an ideal solution because the root cause of the AER
> is in system firmware. Instead, it's a targeted work-around in the
> tg3 driver.
>
> Note also that the PCIe AER must be disabled on the tg3 device even
> if the system is configured to use "firmware first" error handling.
>
> V3:
> - Fix sparse warning on improper comparison of pdev->current_state
> - Adhere to netdev comment style
>
> Fixes: 9fc3bc764334 ("tg3: power down device only on SYSTEM_POWER_OFF")
> Signed-off-by: Lenny Szubowicz <lszubowi@xxxxxxxxxx>

Reviewed-by: Simon Horman <horms@xxxxxxxxxx>

Hi Lenny,

For future reference, please post new versions of patches to netdev
in new email threads.

Ref: https://docs.kernel.org/process/maintainer-netdev.html#resending-after-review
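
For readers following the thread without the diff in front of them, the
work-around described in the quoted commit message can be modelled in user
space roughly as below. This is only a sketch under stated assumptions, not
the code from the patch: it assumes that "disabling AER on the device" means
clearing the four error reporting enable bits in the PCIe Device Control
register, and that the quirk is gated on the restart path, a platform
allow-list, and the device still being accessible. The enum, the helper
names, and the boolean parameters are hypothetical stand-ins for kernel
state; only the bit values are taken from the PCI_EXP_DEVCTL_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error reporting enable bits of the PCIe Device Control register.
 * The values match PCI_EXP_DEVCTL_CERE/NFERE/FERE/URRE in pci_regs.h.
 */
#define DEVCTL_CERE	0x0001	/* Correctable Error Reporting Enable */
#define DEVCTL_NFERE	0x0002	/* Non-Fatal Error Reporting Enable */
#define DEVCTL_FERE	0x0004	/* Fatal Error Reporting Enable */
#define DEVCTL_URRE	0x0008	/* Unsupported Request Reporting Enable */

#define DEVCTL_ERR_REPORTING \
	(DEVCTL_CERE | DEVCTL_NFERE | DEVCTL_FERE | DEVCTL_URRE)

/* Hypothetical stand-in for the kernel's system_state values. */
enum sys_state { SYS_RUNNING, SYS_POWER_OFF, SYS_RESTART };

/* Assumption: the quirk applies only on restart, only on an allow-listed
 * platform, and only while the device can still be reached.
 */
static bool aer_quirk_applies(enum sys_state state, bool platform_listed,
			      bool device_accessible)
{
	return state == SYS_RESTART && platform_listed && device_accessible;
}

/* Return the Device Control value to write back at shutdown time.
 * When the quirk applies, all error reporting enables are cleared and
 * every other bit is preserved; otherwise the value is left unchanged.
 */
static uint16_t devctl_for_shutdown(uint16_t devctl, enum sys_state state,
				    bool platform_listed,
				    bool device_accessible)
{
	if (aer_quirk_applies(state, platform_listed, device_accessible))
		return devctl & (uint16_t)~DEVCTL_ERR_REPORTING;

	return devctl;
}
```

In a driver, the same read-modify-write would normally go through the PCI
core's capability accessors rather than open-coded masking, and the
allow-list would come from a DMI match table rather than a boolean.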