Re: [PATCH net-next] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
From: Cong Wang
Date: Wed Sep 02 2020 - 00:41:52 EST
On Tue, Sep 1, 2020 at 6:42 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>
> On 2020/9/2 2:24, Cong Wang wrote:
> > On Mon, Aug 31, 2020 at 5:59 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
> >>
> >> Currently there is concurrent reset and enqueue operation for the
> >> same lockless qdisc when there is no lock to synchronize the
> >> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> >> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> >> out-of-bounds access for priv->ring[] in hns3 driver if user has
> >> requested a smaller queue num when __dev_xmit_skb() still enqueues an
> >> skb with a larger queue_mapping after the corresponding qdisc is
> >> reset, and call hns3_nic_net_xmit() with that skb later.
> >
> > Can you be more specific here? Which call path requests a smaller
> > tx queue num? If you mean netif_set_real_num_tx_queues(), clearly
> > we already have a synchronize_net() there.
>
> When the netdevice is in active state, the synchronize_net() seems to
> do the correct work, as below:
>
> CPU 0:                                       CPU1:
> __dev_queue_xmit()                           netif_set_real_num_tx_queues()
> rcu_read_lock_bh();
> netdev_core_pick_tx(dev, skb, sb_dev);
>         .
>         .                                    dev->real_num_tx_queues = txq;
>         .                                            .
>         .                                            .
>         .                                    synchronize_net();
>         .                                            .
> q->enqueue()                                         .
>         .                                            .
> rcu_read_unlock_bh()                                 .
>                                              qdisc_reset_all_tx_gt()
>
>
Right.
> but dev->real_num_tx_queues is not RCU-protected, maybe that is a problem
> too.
>
> The problem we hit is as below:
> In hns3_set_channels(), hns3_reset_notify(h, HNAE3_DOWN_CLIENT) is called
> to deactivate the netdevice when the user requests a smaller queue num, and
> txq->qdisc has already been changed to noop_qdisc by the time
> netif_set_real_num_tx_queues() is called, so the synchronize_net() in
> netif_set_real_num_tx_queues() does not help here.
How could the qdisc still be running after the device has been deactivated?
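
To make the failure mode in the report concrete, here is a toy userspace
sketch of a stale queue_mapping indexing past a shrunken ring array. Every
name in it (toy_priv, toy_ring_for, TOY_MAX_RINGS) is hypothetical; it is
not the hns3 code and not a proposed fix, only an illustration of why an skb
picked for the old, larger queue count must not reach the driver afterwards.

```c
#include <stddef.h>

/* Toy userspace model; none of these names are kernel or hns3 APIs. */
#define TOY_MAX_RINGS 8

struct toy_priv {
	int ring[TOY_MAX_RINGS];	/* stand-in for per-queue ring state */
	size_t num_rings;		/* rings currently valid, <= TOY_MAX_RINGS */
};

/*
 * Return the ring for a queue mapping, or NULL when the mapping was
 * chosen against an older, larger queue count and is now out of range.
 */
static int *toy_ring_for(struct toy_priv *p, size_t queue_mapping)
{
	if (queue_mapping >= p->num_rings)
		return NULL;
	return &p->ring[queue_mapping];
}
```

A mapping of 5 is valid while num_rings is 8 and becomes stale once
num_rings drops to 4. The sketch only detects that condition; it says
nothing about where the synchronization belongs.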
>
> >
> >>
> >> Avoid the above concurrent op by calling synchronize_rcu_tasks()
> >> after assigning new qdisc to dev_queue->qdisc and before calling
> >> qdisc_deactivate() to make sure skb with larger queue_mapping
> >> enqueued to old qdisc will always be reset when qdisc_deactivate()
> >> is called.
> >
> > Like Eric said, it is not nice to call such a blocking function when
> > we have a large number of TX queues. Possibly we just need to
> > add a synchronize_net() as in netif_set_real_num_tx_queues(),
> > if it is missing.
>
> As above, the synchronize_net() in netif_set_real_num_tx_queues() seems
> to work when the netdevice is in active state, but does not work when it
> is deactivated.
Please explain why a deactivated device still has a qdisc running.
At least before commit 379349e9bc3b4, we always tested the deactivated
bit before enqueueing. Are you complaining about that commit?
That commit is indeed suspicious; at the least, it does not precisely revert
commit ba27b4cdaaa66561aaedb21 as it claims.
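
The invariant in question, "test the deactivated bit before enqueueing", can
be sketched in plain userspace C as below. This is a single-threaded toy
under stated assumptions: toy_qdisc, toy_enqueue and toy_deactivate are
hypothetical names, not kernel functions, and the sketch deliberately leaves
out the locking/RCU ordering that the real question is about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
#define TOY_QLEN 4

struct toy_qdisc {
	bool deactivated;	/* stands in for the qdisc's deactivated state bit */
	int pkts[TOY_QLEN];	/* queued packets, represented by plain ints */
	size_t len;		/* number of packets currently queued */
};

enum toy_verdict { TOY_QUEUED, TOY_DROPPED };

/* Refuse to queue once the qdisc is deactivated (or when it is full). */
static enum toy_verdict toy_enqueue(struct toy_qdisc *q, int pkt)
{
	if (q->deactivated || q->len == TOY_QLEN)
		return TOY_DROPPED;
	q->pkts[q->len++] = pkt;
	return TOY_QUEUED;
}

/* Mark the qdisc deactivated, then reset it by discarding what is queued. */
static void toy_deactivate(struct toy_qdisc *q)
{
	q->deactivated = true;
	q->len = 0;
}
```

With the check in place, nothing enqueued after toy_deactivate() survives
the reset. Without it, a late enqueue would leave a packet in a qdisc that
has already been reset, which is the situation described in the report.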
>
> And we do not want skbs left in the old qdisc when the netdevice is
> deactivated, right?
Yes, and more importantly, qdisc should not be running after deactivation.
Thanks.