On 2019/1/9 10:03, Jia-Ju Bai wrote:
There are 3 modes in the forcedeth NIC.
On 2019/1/9 9:24, Yanjun Zhu wrote:
On 2019/1/8 20:57, Jia-Ju Bai wrote:
On 2019/1/8 20:54, Zhu Yanjun wrote:
On 2019/1/8 20:45, Jia-Ju Bai wrote:
In drivers/net/ethernet/nvidia/forcedeth.c, the functions
nv_start_xmit() and nv_start_xmit_optimized() can be concurrently
executed with nv_poll_controller().
nv_start_xmit
  line 2321: prev_tx_ctx->skb = skb;

nv_start_xmit_optimized
  line 2479: prev_tx_ctx->skb = skb;

nv_poll_controller
  nv_do_nic_poll
    line 4134: spin_lock(&np->lock);
    nv_drain_rxtx
      nv_drain_tx
        nv_release_txskb
          line 2004: dev_kfree_skb_any(tx_skb->skb);
Thus, two possible concurrency use-after-free bugs may occur.
To fix these possible bugs, ...
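To make the reported pattern concrete, here is a small user-space model of it (purely illustrative: ring_slot, xmit(), drain() and the mutex are stand-ins I made up for the tx_skb context, the two xmit paths, nv_release_txskb() and np->lock; this is not code from the driver). When the store and the free both run under the same lock, as below, they cannot interleave; the report above is about the driver not enforcing such ordering between these two paths.

/*
 * User-space model of the reported pattern; NOT forcedeth code.
 * xmit() plays the role of nv_start_xmit()/nv_start_xmit_optimized()
 * storing into prev_tx_ctx->skb, drain() plays the role of
 * nv_release_txskb() freeing tx_skb->skb, and "lock" plays the role
 * of np->lock.
 */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct ring_slot {
        char *skb;                      /* stand-in for struct sk_buff *skb */
};

static struct ring_slot slot;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *xmit(void *arg)
{
        char *pkt = strdup("packet");

        (void)arg;
        pthread_mutex_lock(&lock);
        slot.skb = pkt;                 /* like prev_tx_ctx->skb = skb; */
        pthread_mutex_unlock(&lock);
        return NULL;
}

static void *drain(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&lock);
        free(slot.skb);                 /* like dev_kfree_skb_any(tx_skb->skb); */
        slot.skb = NULL;
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, xmit, NULL);
        pthread_create(&b, NULL, drain, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        free(slot.skb);                 /* whatever is left, possibly NULL */
        return 0;
}

Build with gcc -pthread; with the lock in place either thread order is safe, and removing the locking would leave the store in xmit() and the free in drain() unordered, which is the kind of window described above.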
Does this really occur? Can you reproduce this?
This bug was not found during real execution.
It was found by a static analysis tool that I wrote, and then I checked it by manual code review.
Before "line 2004: dev_kfree_skb_any(tx_skb->skb); ",
"
nv_disable_irq(dev);
nv_napi_disable(dev);
netif_tx_lock_bh(dev);
netif_addr_lock(dev);
spin_lock(&np->lock);
/* stop engines */
nv_stop_rxtx(dev); <--- this stops rx/tx
nv_txrx_reset(dev);
"
In this case, does nv_start_xmit or nv_start_xmit_optimized still work well?
nv_stop_rxtx() calls nv_stop_tx(dev).
static void nv_stop_tx(struct net_device *dev)
{
        struct fe_priv *np = netdev_priv(dev);
        u8 __iomem *base = get_hwbase(dev);
        u32 tx_ctrl = readl(base + NvRegTransmitterControl);

        if (!np->mac_in_use)
                tx_ctrl &= ~NVREG_XMITCTL_START;
        else
                tx_ctrl |= NVREG_XMITCTL_TX_PATH_EN;
        writel(tx_ctrl, base + NvRegTransmitterControl);
        if (reg_delay(dev, NvRegTransmitterStatus, NVREG_XMITSTAT_BUSY, 0,
                      NV_TXSTOP_DELAY1, NV_TXSTOP_DELAY1MAX))
                netdev_info(dev, "%s: TransmitterStatus remained busy\n",
                            __func__);

        udelay(NV_TXSTOP_DELAY2);
        if (!np->mac_in_use)
                writel(readl(base + NvRegTransmitPoll) & NVREG_TRANSMITPOLL_MAC_ADDR_REV,
                       base + NvRegTransmitPoll);
}
nv_stop_tx() seems to only write registers to tell the hardware to stop transmitting.
But it does not wait until nv_start_xmit() and nv_start_xmit_optimized() finish execution.
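To spell out the window being claimed, here is one possible interleaving, sketched from the call paths quoted earlier in this thread (illustrative only; whether both sides can actually reach these points at the same time is exactly what is in question here):

/*
 *  CPU 0: nv_start_xmit()              CPU 1: nv_do_nic_poll()
 *  ----------------------------        ------------------------------------
 *  prev_tx_ctx->skb = skb;
 *                                      spin_lock(&np->lock);
 *                                      nv_stop_rxtx();      stops the HW only
 *                                      nv_drain_tx();
 *                                        nv_release_txskb();
 *                                          dev_kfree_skb_any(tx_skb->skb);
 *                                          (this can be the skb stored above)
 *  any later use of that skb by the
 *  tx path, or a second free on tx
 *  completion                          -> the use-after-free claimed in the report
 */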
In throughput mode (0), every tx & rx packet will generate an interrupt.
In CPU mode (1), interrupts are controlled by a timer.
In dynamic mode (2), the mode toggles between throughput and CPU mode based on network load.
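For reference, the mode selection in the driver looks like this (quoted from memory, so treat the exact names and the default as something to double-check in forcedeth.c):

enum {
        NV_OPTIMIZATION_MODE_THROUGHPUT,        /* 0: interrupt per tx/rx packet */
        NV_OPTIMIZATION_MODE_CPU,               /* 1: interrupts driven by a timer */
        NV_OPTIMIZATION_MODE_DYNAMIC            /* 2: switch between the two by load */
};
static int optimization_mode = NV_OPTIMIZATION_MODE_DYNAMIC;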
From the source code,
"np->recover_error = 1;" is related with CPU mode.
nv_start_xmit or nv_start_xmit_optimized seems related with ghroughput mode.
In nv_do_nic_poll(struct timer_list *t), when the if (np->recover_error) branch is taken,
line 2004 (dev_kfree_skb_any(tx_skb->skb);) will run.
When "np->recover_error=1", do you think nv_start_xmit or nv_start_xmit_optimized will be called?