Re: [REPORT] net: 3com: 3c59x: Possible data races

From: Sebastian Andrzej Siewior
Date: Fri Oct 05 2018 - 11:04:54 EST


On 2018-10-03 21:52:14 [+0800], Jia-Ju Bai wrote:
> ****** Possible race0 ******
> CPU0:
> vortex_boomerang_interrupt
> line 2510: spin_lock_irqsave()
> _boomerang_interrupt
> line 2432: vp->tx_skbuff[entry] [READ]
> line 2433: vp->tx_skbuff[entry] [READ]
> line 2453: vp->tx_skbuff[entry] = NULL [WRITE]
>
> CPU1:
> boomerang_start_xmit
> line 2145: vp->tx_skbuff[entry] = skb [WRITE]
>
> As for vp->tx_skbuff[entry], the WRITE and READ operations in CPU0
> are performed with holding a spinlock, but the WRITE operation in CPU1
> is performed without holding this spinlock, so there may exist data races.

not really. One side fills the ring buffer (cur_tx++) and the other side
purges it (dirty_tx++). Don't see a overlap here.

> ****** Possible race1 ******
> CPU0:
> vortex_boomerang_interrupt
> line 2510: spin_lock_irqsave()
> _boomerang_interrupt
> line 2421: vp->dirty_tx = dirty_tx [WRITE]
>
> CPU1:
> boomerang_start_xmit
> line 2137: vp->dirty_tx [READ]
>
> As for vp->dirty_tx, the WRITE operation in CPU0 is performed
> with holding a spinlock, but the READ operation in CPU1 is performed
> without holding this spinlock, so there may exist a data race.

basically the same thing as race0

> ****** Possible race2 ******
> CPU0:
> vortex_boomerang_interrupt
> line 2510: spin_lock_irqsave()
> _boomerang_interrupt
> line 2381: vp->handling_irq = 1 [WRITE]
> line 2498: vp->handling_irq = 0 [WRITE]
>
> CPU1:
> boomerang_start_xmit
> line 2134: vp->handling_irq [READ]
>
> As for vp->handling_irq, the WRITE operations in CPU0 are performed
> with holding a spinlock, but the READ operation in CPU1 is performed
> without holding this spinlock, so there may exist data races.

It might but it is not serious. As the comment explains, if CPU1 would
miss to then it would spin the lock while CPU0 would deadlock.
However, I doubt that this happens because the core should hold the
queue lock.

> ****** Possible race3 ******
> CPU0:
> vortex_boomerang_interrupt
> line 2510: spin_lock_irqsave()
> _boomerang_interrupt
> boomerang_rx
> line 2669: skb->ip_summed = ... [WRITE]
>
> CPU1:
> boomerang_start_xmit
> line 2149: skb->ip_summed [READ]
>
> As for skb->ip_summed, the WRITE operation in CPU0 is performed
> with holding a spinlock, but the READ operation in CPU1 is performed
> without holding this spinlock, so there may exist data races.

This race would exist if the skb would be simultaneously on RX and TX
side. Otherwise we good.

>
> These possible races are detected by a runtime testing.

Could you please define runtime testing? Did something happen or did you
just instrument the code and assumed that it might happen?cG

> A possible fix of these races is protecting the code in
> boomerang_start_xmit()
> using the spinlock in vortex_boomerang_interrupt().
> But I am not sure whether this fix is correct, so I only report these races.

Okay. As explained above, I don't think the races are real.

> Best wishes,
> Jia-Ju Bai

Sebastian