Re: [PATCH 2/2] net: ethernet: stmmac: remove private tx queue lock

From: Lino Sanfilippo
Date: Wed Dec 07 2016 - 17:34:47 EST


On 07.12.2016 22:43, Lino Sanfilippo wrote:
> Hi Pavel,
>
> On 07.12.2016 22:37, Pavel Machek wrote:
>> On Wed 2016-12-07 21:05:38, Lino Sanfilippo wrote:
>>> The driver uses a private lock for synchronization between the xmit
>>> function and the xmit completion handler, but since the NETIF_F_LLTX flag
>>> is not set, the xmit function is also called with the xmit_lock held.
>>>
>>> On the other hand the xmit completion handler first takes the private lock
>>> and (in case that the tx queue has been stopped) the xmit_lock, leading to
>>> a reverse locking order and the potential danger of a deadlock.
>>>
>>> Fix this by removing the private lock completely and synchronizing the xmit
>>> function and completion handler solely by means of the xmit_lock. By doing
>>> this remove also the now unnecessary double check for a stopped tx queue.
>>>
>>
>> FYI, here's modified version. I believe _bh versions are needed, and
>> I'm testing that version now. (Oh and I also ported it to net-next).
>>
>> It survived 30 minutes of testing so far...
>>
>
> First off, thanks for testing.
> Hmm. I don't understand why _bh would be needed. We call that function from
> BH context only (napi poll and timer).
> Any idea?
>

Could this once again be caused by irq coalescing? When the tx queue has been stopped,
the cleanup handler has to wake up the queue within a certain time span; otherwise the
watchdog will complain (as happened in your test). Could you retest this with
irq coalescing disabled?