Re: [PATCH] PCI/portdrv: Avoid enabling AER on Thunderbolt devices

From: Bagas Sanjaya
Date: Tue May 16 2023 - 10:15:18 EST


On Mon, Dec 26, 2022 at 11:30:31PM +0800, Kai-Heng Feng wrote:
> We are seeing igc ethernet device on Thunderbolt dock stops working
> after S3 resume because of AER error, or even make S3 resume freeze:
> pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:00:1d.0
> pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00008000/00002000
> pcieport 0000:00:1d.0: [15] HeaderOF
> pcieport 0000:00:1d.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:00:1d.0
> pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00100000/00004000
> pcieport 0000:00:1d.0: [20] UnsupReq (First)
> pcieport 0000:00:1d.0: AER: TLP Header: 34000000 0a000052 00000000 00000000
> pcieport 0000:00:1d.0: AER: Error of this Agent is reported first
> pcieport 0000:04:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> pcieport 0000:04:01.0: device [8086:1136] error status/mask=00300000/00000000
> pcieport 0000:04:01.0: [20] UnsupReq (First)
> pcieport 0000:04:01.0: [21] ACSViol
> pcieport 0000:04:01.0: AER: TLP Header: 34000000 04000052 00000000 00000000
> thunderbolt 0000:05:00.0: AER: can't recover (no error_detected callback)
>
> This supposedly should be fixed by commit c01163dbd1b8 ("PCI/PM: Always disable
> PTM for all devices during suspend"), but somehow it doesn't work for
> this case.
>
> By dumping the PCI_PTM_CTRL register on resume, it turns out PTM is
> already flipped on by either the Thunderbolt dock firmware or the host
> BIOS. Writing 0 to PCI_PTM_CTRL yields the same result.
>
> Windows is however not affected by this issue, by using WinDbg's !pci
> command, it shows that AER is not enabled for devices connected via
> Thunderbolt port, and that's the reason why Windows doesn't exhibit the
> issue.
>
> So turn a blind eye on external Thunderbolt devices like Windows does by
> disabling AER.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216850
> Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
> Cc: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>

Hi,

I noticed a similar regression on Bugzilla [1]. I asked the reporter to
test your patch, but the regression still occurred. For full details,
see the Bugzilla report.

Thanks.

Reported-by: Pengyu Ma <mapengyu@xxxxxxxxx>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217446 [1]

--
An old man doll... just what I always wanted! - Clara
