Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb()

From: Eric Dumazet
Date: Mon Sep 21 2020 - 05:09:20 EST




On 9/21/20 10:40 AM, Yunsheng Lin wrote:
> On 2020/9/21 16:17, Eric Dumazet wrote:
>> On Mon, Sep 21, 2020 at 10:10 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>>>
>>> On 2020/9/21 15:19, Eric Dumazet wrote:
>>>> On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>>>>>
>>>>> When napi_consume_skb() is called in the tx desc cleaning process,
>>>>> it is usually in the softirq context(BH disabled, or are processing
>>>>> softirqs), but it may also be in the task context, such as in the
>>>>> netpoll or loopback selftest process.
>>>>>
>>>>> Currently napi_consume_skb() uses non-zero budget to indicate the
>>>>> NAPI context, the driver writer may provide the wrong budget when
>>>>> tx desc cleaning function is reused for both NAPI and non-NAPI
>>>>> context, see [1].
>>>>>
>>>>> So this patch uses in_softirq() to indicate the NAPI context, which
>>>>> doesn't necessarily mean in NAPI context, but it shouldn't care if
>>>>> NAPI context or not as long as it runs in softirq context or with BH
>>>>> disabled, then _kfree_skb_defer() will push the skb to the particular
>>>>> cpu' napi_alloc_cache atomically.
>>>>>
>>>>> [1] https://lkml.org/lkml/2020/9/15/38
>>>>>
>>>>> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>>>>> ---
>>>>> note that budget parameter is not removed in this patch because it
>>>>> involves many driver changes, we can remove it in separate patch if
>>>>> this patch is accepted.
>>>>> ---
>>>>> net/core/skbuff.c | 6 ++++--
>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>>> index e077447..03d0d28 100644
>>>>> --- a/net/core/skbuff.c
>>>>> +++ b/net/core/skbuff.c
>>>>> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb)
>>>>>
>>>>> void napi_consume_skb(struct sk_buff *skb, int budget)
>>>>> {
>>>>> - /* Zero budget indicate non-NAPI context called us, like netpoll */
>>>>> - if (unlikely(!budget)) {
>>>>> + /* called by non-softirq context, which usually means non-NAPI
>>>>> + * context, like netpoll.
>>>>> + */
>>>>> + if (unlikely(!in_softirq())) {
>>>>> dev_consume_skb_any(skb);
>>>>> return;
>>>>> }
>>>>> --
>>>>
>>>>
>>>> I do not think we should add this kind of fuzzy logic, just because
>>>> _one_ driver author made a mistake.
>>>>
>>>> Add a disable_bh() in the driver slow path, and accept the _existing_
>>>> semantic, the one that was understood by dozens.
>>>
>>> As my understanding, this patch did not change _existing_ semantic,
>>> it still only call _kfree_skb_defer() in softirq context. This patch
>>> just remove the requirement that a softirq context hint need to be
>>> provided to decide whether calling _kfree_skb_defer().
>>
>> I do not want to remove the requirement.
>>
>>>
>>> Yes, we can add DEBUG_NET() clauses to catch this kind of error as
>>> you suggested.
>>>
>>> But why we need such a debug clauses, when we can decide if delaying
>>> skb freeing is possible in napi_consume_skb(), why not just use
>>> in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any()
>>> API use "in_irq() || irqs_disabled()" checking to handle the irq context
>>> and non-irq context.
>>
>>
>> I just do not like your patch.
>>
>> Copying another piece of fuzzy logic, inherited from legacy code is
>> not an excuse.
>>
>> Add a local_bh_disable() in the driver slow path to meet _existing_
>> requirement, so that we can keep the hot path fast.
>
> "!in_softirq()" checking make the napi_consume_skb() slower than
> "!budget" checking? do I miss something?

Yes, you missed that we can _remove_ this condition completely, if drivers
make sure to always have BH disabled.

>
> As a matter of fact, the hns3 driver has fixed this problem by
> passing zero-budget to napi_consume_skb() in non-NAPI context, this
> patch is more about how to avoid or catch this kind of error.

I think I understood this. Having one error in hns3 does not mean we are going
to slow down the stack.

>
> So your opinion is still to catch this kind of error using something
> like DEBUG_NET() clauses?
>

Yes.

Again, we want to keep fast path fast, not something that is an swiss army knife.