Re: [PATCH net] pds_core: Fix pdsc_check_pci_health function to print warning

From: Brett Creeley
Date: Tue Apr 02 2024 - 13:07:51 EST


On 3/22/2024 6:02 PM, Jakub Kicinski wrote:
> On Wed, 20 Mar 2024 23:39:54 -0700 Brett Creeley wrote:
> > When the driver notices fw_status == 0xff it tries to perform a PCI
> > reset on itself via pci_reset_function() in the context of the driver's
> > health thread. However, pdsc_reset_prepare() calls
> > pdsc_stop_health_thread(), which attempts to stop/flush the health
> > thread. This results in a deadlock because the stop/flush will never
> > complete since the driver called pci_reset_function() from the health
> > thread context. Fix this by changing pdsc_check_pci_health() to print
> > a dev_warn() once every fw_down/fw_up cycle and to require the user to
> > reset the device via the sysfs reset interface, reload the driver,
> > rebind the device, etc.

> Dunno, to call PCI reset you don't need much device context.
> Perhaps you could allocate a work entry, throw it onto a per-driver
> workqueue, and return. Basically some minimal viable way to
> "asynchronously" call pci_reset_function()?
> You can take a ref on the device so it doesn't disappear.
> And flush that queue on module unload.
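
For readers following the thread, the deferral suggested above can be modelled
roughly as below. This is a single-threaded userspace sketch, not pds_core code:
every name in it (model_dev, model_reset, and so on) is hypothetical, and the
"deadlock" is only counted rather than reproduced. In a real driver the pieces
would presumably map to INIT_WORK()/queue_work() for the deferral,
pci_dev_get()/pci_dev_put() for the reference, and a workqueue flush on unload,
but that mapping is an assumption, not something taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in pds_core. */
struct model_dev {
	int refcount;         /* stands in for the device reference count */
	bool in_health_work;  /* true while the health check is running */
	bool reset_pending;   /* at most one reset is queued at a time */
	int resets_done;      /* resets that completed */
	int deadlocks;        /* resets that would have flushed their own caller */
};

/* Models the reset path: its prepare step has to flush the health work. */
static void model_reset(struct model_dev *d)
{
	if (d->in_health_work) {
		/* Flushing the work we are running in could never finish. */
		d->deadlocks++;
		return;
	}
	d->resets_done++;
}

/*
 * Health check. With defer == false it resets inline (the problematic path);
 * with defer == true it only queues the reset and returns.
 */
static void model_health_check(struct model_dev *d, unsigned char fw_status,
			       bool defer)
{
	d->in_health_work = true;
	if (fw_status == 0xff) {
		if (!defer) {
			model_reset(d);
		} else if (!d->reset_pending) {
			d->refcount++;          /* keep the device alive */
			d->reset_pending = true;
		}
	}
	d->in_health_work = false;
}

/* Separate work item; a flush at unload would run this to completion. */
static void model_run_reset_work(struct model_dev *d)
{
	if (!d->reset_pending)
		return;
	d->reset_pending = false;
	model_reset(d);
	d->refcount--;                          /* drop the reference taken above */
}
```

Calling model_health_check() with defer set to false and fw_status 0xff bumps
the deadlocks counter, while the deferred variant followed by
model_run_reset_work() completes one reset and returns the reference count to
its starting value.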

Hi Jakub,

Yeah, this is better than my proposed solution. Now that I'm back from vacation I will work on a v2.

Thanks for the review,

Brett