Re: [PATCH net-next] net: sch_generic: avoid concurrent reset and enqueue op for lockless qdisc

From: Yunsheng Lin
Date: Tue Sep 01 2020 - 21:42:50 EST


On 2020/9/2 2:24, Cong Wang wrote:
> On Mon, Aug 31, 2020 at 5:59 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>>
>> Currently there is concurrent reset and enqueue operation for the
>> same lockless qdisc when there is no lock to synchronize the
>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>> requested a smaller queue num when __dev_xmit_skb() still enqueues a
>> skb with a larger queue_mapping after the corresponding qdisc is
>> reset, and call hns3_nic_net_xmit() with that skb later.
>
> Can you be more specific here? Which call path requests a smaller
> tx queue num? If you mean netif_set_real_num_tx_queues(), clearly
> we already have a synchronize_net() there.

When the netdevice is active, the synchronize_net() seems to do the
right thing, as shown below:

CPU 0:                                   CPU 1:
__dev_queue_xmit()                       netif_set_real_num_tx_queues()
rcu_read_lock_bh();
netdev_core_pick_tx(dev, skb, sb_dev);
        .
        .                                dev->real_num_tx_queues = txq;
        .                                        .
        .                                        .
        .                                synchronize_net();
        .                                        .
q->enqueue()                                     .
        .                                        .
rcu_read_unlock_bh()                             .
                                         qdisc_reset_all_tx_gt


However, dev->real_num_tx_queues is not RCU-protected, which may be a
problem too.

The problem we hit is as follows:
In hns3_set_channels(), hns3_reset_notify(h, HNAE3_DOWN_CLIENT) is called
to deactivate the netdevice when the user requests a smaller queue num, and
txq->qdisc has already been changed to noop_qdisc by the time
netif_set_real_num_tx_queues() is called, so the synchronize_net() in
netif_set_real_num_tx_queues() does not help here.
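
To make the failure mode concrete, here is a rough userspace model (not
the hns3 code; struct model_priv, struct model_skb and model_xmit() are
made-up names for illustration). It only shows why a skb that kept a
queue_mapping from before the shrink no longer fits the ring array; the
model reports the bad index instead of dereferencing it:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_RINGS 8

/* Hypothetical stand-ins, not the kernel's sk_buff or the hns3 priv. */
struct model_skb {
	unsigned int queue_mapping;  /* tx queue picked at enqueue time */
};

struct model_priv {
	unsigned int num_rings;      /* rings valid after the last resize */
	int ring[MODEL_MAX_RINGS];   /* per-ring packet counters */
};

/*
 * Returns 0 if the skb maps to a valid ring, -1 if it would index past
 * the valid rings. In the scenario described above there is no such
 * check, so the stale index becomes an out-of-bounds access.
 */
static int model_xmit(struct model_priv *priv, const struct model_skb *skb)
{
	assert(priv != NULL && skb != NULL);
	assert(priv->num_rings <= MODEL_MAX_RINGS);

	if (skb->queue_mapping >= priv->num_rings)
		return -1;

	priv->ring[skb->queue_mapping]++;
	return 0;
}
```

With 8 rings a skb with queue_mapping 6 is fine; after shrinking to 4
rings the same skb is rejected by the model.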

>
>>
>> Avoid the above concurrent op by calling synchronize_rcu_tasks()
>> after assigning new qdisc to dev_queue->qdisc and before calling
>> qdisc_deactivate() to make sure skb with larger queue_mapping
>> enqueued to old qdisc will always be reset when qdisc_deactivate()
>> is called.
>
> Like Eric said, it is not nice to call such a blocking function when
> we have a large number of TX queues. Possibly we just need to
> add a synchronize_net() as in netif_set_real_num_tx_queues(),
> if it is missing.

As above, the synchronize_net() in netif_set_real_num_tx_queues() seems
to work when the netdevice is active, but does not help when it is
deactivated.

And we do not want any skb left in the old qdisc when the netdevice is
deactivated, right?

As in my reply to Eric, maybe the existing synchronize_net() in
dev_deactivate_many() can be reused to order the qdisc assignment and the
qdisc reset?
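
The ordering I have in mind can be sketched with a single-threaded
userspace model (this is not kernel code; every name below is
hypothetical, and the grace period is modelled simply by letting the
in-flight sender finish before the reset). The sender samples the qdisc
pointer first and enqueues later, as q->enqueue() does relative to the
qdisc lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these types are the kernel's. */
struct model_qdisc {
	int qlen;      /* packets currently queued */
	bool is_noop;  /* a noop qdisc drops everything */
};

struct model_txq {
	struct model_qdisc *qdisc;  /* stands in for dev_queue->qdisc */
};

/* Sender, step 1: sample the qdisc pointer (read-side section). */
static struct model_qdisc *sender_pick(const struct model_txq *txq)
{
	assert(txq != NULL && txq->qdisc != NULL);
	return txq->qdisc;
}

/* Sender, step 2: enqueue to whatever qdisc was sampled earlier. */
static void sender_enqueue(struct model_qdisc *q)
{
	if (!q->is_noop)
		q->qlen++;
}

static void model_reset(struct model_qdisc *q)
{
	q->qlen = 0;
}

/* Reset runs before the in-flight sender enqueues. Returns leftover qlen. */
static int deactivate_reset_first(struct model_txq *txq,
				  struct model_qdisc *noop)
{
	struct model_qdisc *old = txq->qdisc;
	struct model_qdisc *seen = sender_pick(txq);  /* sender in flight */

	txq->qdisc = noop;     /* assign the new qdisc */
	model_reset(old);      /* reset without waiting */
	sender_enqueue(seen);  /* late enqueue lands in the reset qdisc */
	return old->qlen;
}

/* Reset runs only after the in-flight sender is done. Returns leftover qlen. */
static int deactivate_wait_then_reset(struct model_txq *txq,
				      struct model_qdisc *noop)
{
	struct model_qdisc *old = txq->qdisc;
	struct model_qdisc *seen = sender_pick(txq);  /* sender in flight */

	txq->qdisc = noop;     /* assign the new qdisc */
	sender_enqueue(seen);  /* models the grace period: sender finishes */
	model_reset(old);      /* reset now also clears the late skb */
	return old->qlen;
}
```

In the first variant one skb is left in the old qdisc after the reset;
in the second the old qdisc ends up empty, and any sender that starts
after the assignment only ever sees the noop qdisc.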

>
> Thanks.
> .
>