Re: [PATCH] Fix race condition in enc28j60 driver
From: David Miller
Date: Sat Jul 02 2016 - 14:57:48 EST
From: Sergio Valverde <sergio.valverde@xxxxxxx>
Date: Fri, 1 Jul 2016 11:44:30 -0600
> From: Sergio Valverde <sergio.valverde@xxxxxxx>
>
> The interrupt worker code for the enc28j60 relies only on the TXIF flag to
> determine whether packet transmission has completed. However, section 12.1.3
> of the datasheet specifies that TXERIF will clear TXRTS after a transmit
> abort, and section 12.1.4 specifies that TXIF will be set when TXRTS
> transitions from '1' to '0'. Therefore the TXIF flag is also set on
> transmission errors.
>
> This causes a race condition: the worker code invokes
> enc28j60_tx_clear() -> netif_wake_queue(), potentially invoking the
> ndo_start_xmit function to send a new packet. The enc28j60_send_packet
> function uses a workqueue that invokes enc28j60_hw_tx(). Before that function
> is called, the worker from the interrupt handler can enter the error-handling
> path because of the TXERIF flag, invoking enc28j60_tx_clear() again and
> releasing the packet scheduled for transmission, which causes a kernel crash
> due to a NULL pointer dereference.
>
> These NULL pointer crashes were observed under stress conditions of the
> device. A BUG_ON() sequence was used to validate that the issue was fixed, and
> the fix has been running without problems for 2 years now.
>
> Signed-off-by: Diego Dompe <dompe@xxxxxxx>
> Acked-by: Sergio Valverde <sergio.valverde@xxxxxxx>
Applied.
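
For readers following the flag logic in the commit message, here is a minimal
user-space sketch of it, not the driver code or the applied patch. It assumes
the only thing that matters is how many times the TX state gets cleared for a
given set of interrupt flags. The struct, tx_clear(), irq_work() and
count_tx_clear() are hypothetical names for illustration, and the guard shown
(skipping the TXIF path when TXERIF is also set) is one plausible way to avoid
the double clear. The bit values are illustrative as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative interrupt flag bits (assumed values for this sketch). */
#define EIR_TXIF   0x08u  /* set when TXRTS goes from 1 to 0 */
#define EIR_TXERIF 0x02u  /* set on a transmit abort */

/* Hypothetical stand-in for the driver's private state. */
struct tx_model {
	int tx_clear_calls;  /* how often the pending packet was released */
};

/* Stand-in for enc28j60_tx_clear(): release packet and wake the queue. */
static void tx_clear(struct tx_model *m)
{
	assert(m != NULL);
	m->tx_clear_calls++;
}

/*
 * Model of the interrupt worker. With guard == false, TXIF alone selects
 * the "transmit done" path, so an aborted transmit (TXIF and TXERIF both
 * set) clears the TX state twice. With guard == true, the "done" path is
 * skipped whenever TXERIF is set, leaving the clear to the error path.
 */
static void irq_work(struct tx_model *m, unsigned int flags, bool guard)
{
	bool tx_done = (flags & EIR_TXIF) != 0;
	bool tx_err = (flags & EIR_TXERIF) != 0;

	if (tx_done && !(guard && tx_err))
		tx_clear(m);  /* normal completion path */
	if (tx_err)
		tx_clear(m);  /* error handling path */
}

/* Helper: run one worker pass and report how many clears happened. */
static int count_tx_clear(unsigned int flags, bool guard)
{
	struct tx_model m = { 0 };

	irq_work(&m, flags, guard);
	return m.tx_clear_calls;
}
```

Calling count_tx_clear(EIR_TXIF | EIR_TXERIF, false) models the racy case and
reports two clears. With the guard enabled the same flags give one. In the
real driver, the second clear can land after ndo_start_xmit has queued a new
packet, which is where the NULL pointer comes from.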