Re: [PATCH] selftests: net: ip_defrag: increase netdev_max_backlog
From: Eric Dumazet
Date: Fri Dec 06 2019 - 08:41:06 EST
On 12/6/19 4:17 AM, Thadeu Lima de Souza Cascardo wrote:
> On Wed, Dec 04, 2019 at 12:03:57PM -0800, Eric Dumazet wrote:
>>
>>
>> On 12/4/19 11:53 AM, Thadeu Lima de Souza Cascardo wrote:
>>> When using fragments with size 8 and payload larger than 8000, the backlog
>>> might fill up and packets will be dropped, causing the test to fail. This
>>> happens often enough when conntrack is on during the IPv6 test.
>>>
>>> As the larger payload in the test is 10000, using a backlog of 1250 allow
>>> the test to run repeatedly without failure. At least a 1000 runs were
>>> possible with no failures, when usually less than 50 runs were good enough
>>> for showing a failure.
>>>
>>> As netdev_max_backlog is not a pernet setting, this sets the backlog to
>>> 1000 during exit to prevent disturbing following tests.
>>>
>>
>> Hmmm... I would prefer not changing a global setting like that.
>> This is going to be flaky since we often run tests in parallel (using different netns)
>>
>> What about adding a small delay after each sent packet ?
>>
>> diff --git a/tools/testing/selftests/net/ip_defrag.c b/tools/testing/selftests/net/ip_defrag.c
>> index c0c9ecb891e1d78585e0db95fd8783be31bc563a..24d0723d2e7e9b94c3e365ee2ee30e9445deafa8 100644
>> --- a/tools/testing/selftests/net/ip_defrag.c
>> +++ b/tools/testing/selftests/net/ip_defrag.c
>> @@ -198,6 +198,7 @@ static void send_fragment(int fd_raw, struct sockaddr *addr, socklen_t alen,
>> error(1, 0, "send_fragment: %d vs %d", res, frag_len);
>>
>> frag_counter++;
>> + usleep(1000);
>> }
>>
>> static void send_udp_frags(int fd_raw, struct sockaddr *addr,
>>
>
> That won't work because the issue only shows when we using conntrack, as the
> packet will be reassembled on output, then fragmented again. When this happens,
> the fragmentation code is transmitting the fragments in a tight loop, which
> floods the backlog.
Interesting !
So it looks like the test is correct, and exposed a long standing problem in this code.
We should not adjust the test to some kernel-of-the-day-constraints, and instead fix the kernel bug ;)
Where is this tight loop exactly ?
If this is feeding/bursting ~1000 skbs via netif_rx() in a BH context, maybe we need to call a variant
that allows immediate processing instead of (ab)using the softnet backlog.
Thanks !