Re: [PATCH net-next v3] netdevsim: call napi_schedule from a timer context

From: Breno Leitao
Date: Thu Feb 20 2025 - 06:28:14 EST


On Wed, Feb 19, 2025 at 08:41:20AM -0800, Breno Leitao wrote:
> The netdevsim driver was experiencing NOHZ tick-stop errors during packet
> transmission due to pending softirq work when calling napi_schedule().
> This issue was observed when running the netconsole selftest, which
> triggered the following error message:
>
> NOHZ tick-stop error: local softirq work is pending, handler #08!!!
>
> To fix this issue, introduce a timer that schedules napi_schedule()
> from a timer context instead of calling it directly from the TX path.
>
> Create an hrtimer for each queue and kick it from the TX path,
> which then schedules napi_schedule() from the timer context.
>
> Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx>
> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> ---

Looking at the tests, 3 of them are failing:

https://netdev.bots.linux.dev/flakes.html


Two of the three passed when retried; only one of them
(ip6gre-custom-multipath-hash-sh) also failed on the retry.

Looking at the flakes, I see that ip6gre-custom-multipath-hash-sh was
flaky yesterday:

https://netdev.bots.linux.dev/flakes.html?min-flip=0&tn-needle=ip6gre-custom-multipath-hash-sh

I've tested it manually, and the test passes:

# vng -v --run . --user root --cpus 4 -- \
    make -C tools/testing/selftests TARGETS=net/forwarding TEST_PROGS=ip6gre_custom_multipath_hash.sh TEST_GEN_PROGS="" run_tests

...

ok 1 selftests: net/forwarding: ip6gre_custom_multipath_hash.sh


So, from a NIPA testing perspective, the patch seems good.