Re: [PATCH 1/2] ixgbe: release lock for the duration of ixgbe_suspend_close()

From: Alexander Duyck
Date: Thu May 03 2018 - 09:46:16 EST

Next message: Pavel Tatashin: "Re: [per_cpu_ptr_to_phys] PANIC: early exception 0x0d IP 10:ffffffffa892f15f error 0 cr2 0xffff88001fbff000"
Previous message: Joerg Roedel: "Re: [PATCH v4 1/2] iommu - Enable debugfs exposure of the IOMMU"
In reply to: Pavel Tatashin: "Re: [PATCH 1/2] ixgbe: release lock for the duration of ixgbe_suspend_close()"
Next in thread: Steven Sistare: "Re: [PATCH 1/2] ixgbe: release lock for the duration of ixgbe_suspend_close()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, May 3, 2018 at 6:30 AM, Pavel Tatashin
<pasha.tatashin@xxxxxxxxxx> wrote:
> Hi Alex,

Hi Pavel,

>> I'm not a fan of dropping the mutex while we go through
>> ixgbe_close_suspend. I'm concerned it will result in us having a
>> number of races on shutdown.
>
> I would agree, but ixgbe_close_suspend() is already called without this
> mutex from ixgbe_close(). This path is executed also during machine
> shutdown but when kexec'ed. So, it is either an existing bug or there are
> no races. But, because ixgbe_close() is called from the userland, and a
> little earlier than ixgbe_shutdown() I think this means there are no races.

All callers of ixgbe_close are supposed to already be holding the RTNL
mutex. That is the whole reason why we need to hold it here as the
shutdown path doesn't have the mutex held otherwise. The mutex was
added here because shutdown was racing with the open/close calls.

>
>> If anything, I think we would need to find a replacement for the mutex
>> that can operate at the per-netdev level if you are wanting to
>> parallelize this.
>
> Yes, if lock is necessary, it can be replaced in this place (and added to
> ixgbe_close()) with something scalable.

I wouldn't suggest adding yet another lock for all this. Instead it
might make more sense for us to start looking at breaking up the
locking since most of the networking subsystem uses the rtnl_lock call
it might be time to start looking at doing something like cutting back
on the use of this in cases where all the resources needed are
specific to one device and instead have a per device lock that could
address those kind of cases.

- Alex

Next message: Pavel Tatashin: "Re: [per_cpu_ptr_to_phys] PANIC: early exception 0x0d IP 10:ffffffffa892f15f error 0 cr2 0xffff88001fbff000"
Previous message: Joerg Roedel: "Re: [PATCH v4 1/2] iommu - Enable debugfs exposure of the IOMMU"
In reply to: Pavel Tatashin: "Re: [PATCH 1/2] ixgbe: release lock for the duration of ixgbe_suspend_close()"
Next in thread: Steven Sistare: "Re: [PATCH 1/2] ixgbe: release lock for the duration of ixgbe_suspend_close()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]