Re: [PATCH V2] PCI: AER: fix deadlock in do_recovery

From: Govindarajulu Varadarajan
Date: Mon Oct 02 2017 - 20:20:06 EST


On Sat, 30 Sep 2017, Sinan Kaya wrote:

On 9/30/2017 1:49 AM, Govindarajulu Varadarajan wrote:
This patch does a pci_bus_walk and adds all the devices to a list. After
unlocking (up_read) &pci_bus_sem, we go through the list and call
err_handler of the devices with device_lock() held. This way, we don't try
to hold both locks at the same time.

I do like this approach with some more feedback.

I need a little bit of help here from someone that knows get/put device calls.

I understand get_device() and put_device() are there to increment/decrement
reference counters. This patch seems to use them as an alternative for device_lock()
and device_unlock() API.


get_device()/put_device() are not used as an alternative to device_lock() here.
We are incrementing the reference count to make sure no one else frees the
devices while they are in our list. report_error_detected() and the other
callbacks will still acquire device_lock().
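
For illustration, the two-phase pattern discussed in this thread can be modelled
in userspace C roughly as below. This is only a sketch under stated assumptions,
not the patch itself: struct model_dev, struct dev_node, bus_sem_readers,
collect_devices(), run_handlers() and model_err_handler() are all hypothetical
names, and the locks and reference count are modelled as plain integers in a
single thread so that the ordering can be checked with assertions. In the kernel
these roles are played by pci_bus_sem, get_device()/put_device() and
device_lock()/device_unlock().

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for a device; not the kernel's struct device. */
struct model_dev {
    int refcount;    /* models get_device()/put_device() */
    int locked;      /* models device_lock()/device_unlock() */
    int handled;     /* set once the error callback has run */
    int saw_bus_sem; /* 1 if the bus lock was held during the callback */
};

/* Models the read side of pci_bus_sem as a simple reader count. */
static int bus_sem_readers;

struct dev_node {
    struct model_dev *dev;
    struct dev_node *next;
};

/* Hypothetical callback: requires the device lock, records bus lock state. */
static void model_err_handler(struct model_dev *dev)
{
    assert(dev->locked);
    dev->saw_bus_sem = (bus_sem_readers > 0);
    dev->handled = 1;
}

/* Phase 1: walk the bus under the bus lock, pin each device, queue it. */
static struct dev_node *collect_devices(struct model_dev *bus, size_t n)
{
    struct dev_node *head = NULL, **tail = &head;

    bus_sem_readers++;                 /* models down_read(&pci_bus_sem) */
    for (size_t i = 0; i < n; i++) {
        struct dev_node *node = malloc(sizeof(*node));
        if (!node)
            break;
        bus[i].refcount++;             /* models get_device() */
        node->dev = &bus[i];
        node->next = NULL;
        *tail = node;
        tail = &node->next;
    }
    bus_sem_readers--;                 /* models up_read(&pci_bus_sem) */
    return head;
}

/* Phase 2: bus lock is no longer held; take each device lock in turn. */
static void run_handlers(struct dev_node *head)
{
    while (head) {
        struct dev_node *next = head->next;
        struct model_dev *dev = head->dev;

        dev->locked = 1;               /* models device_lock() */
        model_err_handler(dev);
        dev->locked = 0;               /* models device_unlock() */
        dev->refcount--;               /* models put_device() */
        free(head);
        head = next;
    }
}
```

The reference taken in phase 1 only keeps the device from being freed while it
sits on the list; mutual exclusion for the callback still comes from the device
lock taken in phase 2, which is never held together with the bus lock.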


If this is a good assumption, then you can get away with just replacing device_lock()
with get_device() and device_unlock() with put_device() in the existing code as
well. Then, you don't need to build a linked list.