Re: [PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc

From: Yunsheng Lin
Date: Tue Apr 13 2021 - 05:04:40 EST


On 2021/4/13 16:33, Hillf Danton wrote:
> On Tue, 13 Apr 2021 15:57:29 Yunsheng Lin wrote:
>> On 2021/4/13 15:12, Hillf Danton wrote:
>>> On Tue, 13 Apr 2021 11:34:27 Yunsheng Lin wrote:
>>>> On 2021/4/13 11:26, Hillf Danton wrote:
>>>>> On Tue, 13 Apr 2021 10:56:42 Yunsheng Lin wrote:
>>>>>> On 2021/4/13 10:21, Hillf Danton wrote:
>>>>>>> On Mon, 12 Apr 2021 20:00:43 Yunsheng Lin wrote:
>>>>>>>>
>>>>>>>> Yes, the below patch seems to fix the data race described in
>>>>>>>> the commit log.
>>>>>>>> Then what is the difference between my patch and your patch below:)
>>>>>>>
>>>>>>> Hehe, this is one of the tough questions over a bunch of weeks.
>>>>>>>
>>>>>>> If a seqcount can detect the race between skb enqueue and dequeue, then we
>>>>>>> can't see any excuse for not rolling back to the point without NOLOCK.
>>>>>>
>>>>>> I am not sure I understood what you meant above.
>>>>>>
>>>>>> As I understand it, the below patch is essentially the same as
>>>>>> your previous patch, the only difference I see is it uses qdisc->pad
>>>>>> instead of __QDISC_STATE_NEED_RESCHEDULE.
>>>>>>
>>>>>> So instead of proposing another patch, it would be better if you
>>>>>> comment on my patch and make improvements upon that.
>>>>>>
>>>>> Happy to do that after you show how it helps revert NOLOCK.
>>>>
>>>> Actually I am not going to revert NOLOCK, but to add optimizations
>>>> to it if this patch fixes the packet stuck problem.
>>>>
>>> A fix is not an optimization, right?
>>
>> For this patch, it is a fix.
>> In case you missed it, I do have a couple of ideas to optimize the
>> lockless qdisc:
>>
>> 1. RFC patch to add lockless qdisc bypass optimization:
>>
>> https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@xxxxxxxxxx/
>>
>> 2. Implement lockless enqueuing for lockless qdisc using the idea
>> from Jason and Toke. It shows a noticeable performance increase with
>> 1-4 threads running, using the below prototype based on ptr_ring (a
>> matching consume-side sketch follows this list).
>>
>> static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr)
>> {
>> 	int producer, next_producer;
>>
>> 	/* Claim a slot by advancing r->producer; retry if another
>> 	 * producer wins the cmpxchg race.
>> 	 */
>> 	do {
>> 		producer = READ_ONCE(r->producer);
>> 		if (unlikely(!r->size) || r->queue[producer])
>> 			return -ENOSPC;
>> 		next_producer = producer + 1;
>> 		if (unlikely(next_producer >= r->size))
>> 			next_producer = 0;
>> 	} while (cmpxchg_relaxed(&r->producer, producer, next_producer) != producer);
>>
>> 	/* Make sure the pointer we are storing points to valid data.
>> 	 * Pairs with the dependency ordering in __ptr_ring_consume().
>> 	 */
>> 	smp_wmb();
>>
>> 	WRITE_ONCE(r->queue[producer], ptr);
>> 	return 0;
>> }
>>
>> 3. Maybe it is possible to remove the netif_tx_lock for lockless qdisc
>> too, because dev_hard_start_xmit is also under the protection of
>> qdisc_run_begin()/qdisc_run_end() (if there is only one qdisc using
>> a netdev queue, which is true for pfifo_fast, I believe); see the
>> qdisc_run_begin() sketch after this list.
>>
>> 4. Remove the qdisc->running seqcount operation for lockless qdisc; the
>> seqcount is mainly used to do heuristic locking on q->busylock for the
>> locked qdisc.
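>>
>> The smp_wmb() in the prototype above is meant to pair with the address
>> dependency on the consumer side. A minimal single-consumer counterpart
>> could look like the sketch below; the name is made up, and the real
>> __ptr_ring_consume() additionally batches consumer index updates,
>> which is omitted here:
>>
>> static inline void *__ptr_ring_consume_one(struct ptr_ring *r)
>> {
>> 	void *ptr;
>>
>> 	if (unlikely(!r->size))
>> 		return NULL;
>>
>> 	/* The READ_ONCE() of the slot forms an address dependency
>> 	 * with the later dereference of ptr, pairing with the
>> 	 * producer's smp_wmb() so the pointed-to data is visible.
>> 	 */
>> 	ptr = READ_ONCE(r->queue[r->consumer_head]);
>> 	if (!ptr)
>> 		return NULL;
>>
>> 	/* Release the slot so a producer can reuse it, then advance. */
>> 	WRITE_ONCE(r->queue[r->consumer_head], NULL);
>> 	if (unlikely(++r->consumer_head >= r->size))
>> 		r->consumer_head = 0;
>>
>> 	return ptr;
>> }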
>>
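>> For 3 and 4, the serialization being relied on is in qdisc_run_begin(),
>> roughly like below (paraphrased from include/net/sch_generic.h and
>> net/core/dev.c of the current kernel, details elided):
>>
>> static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>> {
>> 	if (qdisc->flags & TCQ_F_NOLOCK) {
>> 		/* Only one cpu can win the trylock, so everything up
>> 		 * to qdisc_run_end(), including dev_hard_start_xmit(),
>> 		 * runs on one cpu at a time per qdisc (idea 3).
>> 		 */
>> 		if (!spin_trylock(&qdisc->seqlock))
>> 			return false;
>> 	} else if (qdisc_is_running(qdisc)) {
>> 		return false;
>> 	}
>>
>> 	/* This qdisc->running update is what idea 4 would skip for
>> 	 * lockless qdisc; for locked qdisc it drives the q->busylock
>> 	 * heuristic in __dev_xmit_skb():
>> 	 *
>> 	 *	contended = qdisc_is_running(q);
>> 	 *	if (unlikely(contended))
>> 	 *		spin_lock(&q->busylock);
>> 	 */
>> 	raw_write_seqcount_begin(&qdisc->running);
>> 	return true;
>> }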
>
> Sounds good. They can stand two months, can't they?
>>>
>>>> Is there any reason why you want to revert it?
>>>>
>>> I think you know Jiri's plan and it would be nice to wait a couple of
>>> months for it to complete.
>>
>> I am afraid I am not aware of Jiri's plan.
>> Is there any link describing the plan?
>>
> https://lore.kernel.org/lkml/eaff25bc-9b64-037e-b9bc-c06fc4a5a9fb@xxxxxxxxxx/

I think there is some misunderstanding here.

As I understand it, Jiri and Juergen are from the same team (both
using the suse.com mail server), and what Jiri meant by "I am still
planning to have Yunsheng Lin's (CCing) fix [1] tested in the coming
days." is that Juergen will do the test and provide a "Tested-by" tag.

So if this patch fixes the packet stuck problem, Jiri is ok with
NOLOCK qdisc too.

Or do I misunderstand it again? Perhaps Jiri and Juergen can help to
clarify this?

