Re: [PATCH v2 3/7] [PATCH 3/8] can: CAN Network device driver andNetlink interface

From: Wolfgang Grandegger
Date: Wed May 13 2009 - 08:24:10 EST


Oliver Hartkopp wrote:
> Wolfgang Grandegger wrote:
>> Oliver Hartkopp wrote:
>>
>>> Andrew Morton wrote:
>>>
>>>> On Tue, 12 May 2009 11:28:00 +0200 Wolfgang Grandegger
>>>> <wg@xxxxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>>> +int can_restart_now(struct net_device *dev)
>>>>> +{
>>>>> + struct can_priv *priv = netdev_priv(dev);
>>>>> + struct net_device_stats *stats = &dev->stats;
>>>>> + struct sk_buff *skb;
>>>>> + struct can_frame *cf;
>>>>> + int err;
>>>>> +
>>>>> + /* Synchronize with dev->hard_start_xmit() */
>>>>> + netif_tx_lock(dev);
>>>>> +
>>>>> + /* Ensure that no more messages can go out */
>>>>> + if (netif_carrier_ok(dev))
>>>>> + netif_carrier_off(dev);
>>>>> +
>>>>> + /* Ensure that no more messages can come in */
>>>>> + err = priv->do_set_mode(dev, CAN_MODE_STOP);
>>>>> + if (err)
>>>>> + return err;
>>>>> +
>>>>> + /* Now it's save to clean up */
>>>>> + del_timer_sync(&priv->restart_timer);
>>>>>
>>>> This is deadlockable.
>>>>
>>>> It calls del_timer_sync() while holding netif_tx_lock(). But the timer
>>>> handler (can_restart_now()) also takes netif_tx_lock(). So if the
>>>> timer handler is presently running, it's sitting there spinning in
>>>> netif_tx_lock(). And del_timer_sync() is sitting there waiting for the
>>>> timer handler to complete.
>>>>
>>>>
>>>>
>>> Hi Wolfgang,
>>>
>>> would it be an appropriate solution, just to invoke
>>>
>>> netif_stop_queue() in can_bus_off()
>>>
>>> and invoke
>>>
>>> netif_wake_queue() in can_restart_now()
>>>
>>> ???
>>>
>>> In a BUSOFF condition we're not able to send CAN frames anyway, so we
>>> can disable the device queue and the we won't need any netif_tx_lock()
>>> right?
>>>
>>> AFAIK this was the original implementation before some of the latest
>>> improvement with the netif_carrier stuff.
>>>
>>> What do you think?
>>>
>>
>> The problem is the "manual" restart triggered via the netlink interface,
>> which can occur in the middle of ndo_start_xmit().
>>
>>
>
> Ah, i see.
>
> What if the manual restart via netlink would also stop the queue and
> start the timer?

It will not help if the restart is triggered in the middle of
ndo_start_xmit().

Wolfgang.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/