Re: [PATCH v4 1/3] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

From: Smita Koralahalli
Date: Tue Sep 12 2023 - 18:45:43 EST


On 8/28/2023 12:35 AM, Lukas Wunner wrote:
> On Tue, Aug 15, 2023 at 09:20:41PM +0000, Smita Koralahalli wrote:
> > According to PCIe r6.0 sec 6.7.6 [1], async removal with DPC may result in
> > surprise down error. This error is expected and is just a side-effect of
> > async remove.
> >
> > Add support to handle the surprise down error generated as a side-effect
> > of async remove. Typically, this error is benign as the pciehp handler
> > invoked by PDC or/and DLLSC alongside DPC, de-enumerates and brings down
> > the device appropriately. But the error messages might confuse users. Get
> > rid of these irritating log messages with a 1s delay while pciehp waits
> > for dpc recovery.
> [...]
> > Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@xxxxxxx>
>
> Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx>
>
> The subject is slightly inaccurate as this doesn't touch pciehp source
> files, although it is *related* to pciehp.
>
> As an example, a perhaps more accurate subject might be something like...
>
> PCI/DPC: Ignore Surprise Down errors on hot removal
>
> ...but I don't think it's necessary to respin just for that as Bjorn is
> probably able to adjust the subject to his liking when applying the patch.
>
> Thanks a lot for patiently pursuing this issue, good to see it fixed.
>
> Five years ago there was an attempt to solve it through masking Surprise
> Down errors, which you've verified to not be a viable approach:
>
> https://patchwork.kernel.org/project/linux-pci/patch/20180818065126.77912-2-okaya@xxxxxxxxxx/
>
> Lukas

Thanks for the review. Would it be possible to consider this patch on its own while I work on 10-bit tags enumeration? I can send a v5 of this patch with the $SUBJECT change and also include clearing Atomic Ops and 10-bit tags unconditionally on hot-remove, if that works.

Thanks,
Smita