6.10/bisected/regression - commits 4e7aaa6b82d6 cause appearing WARNING at net/netfilter/ipset/ip_set_core.c:1200 suspicious rcu_dereference_protected() usage

From: Mikhail Gavrilov
Date: Thu Jun 20 2024 - 02:55:36 EST


Hi,
between 2ef5971ff345 and rc4 I spotted a new regression.
It is expressed in the appearance warning with stacktrace after one
minute after boot.

=============================
WARNING: suspicious RCU usage
6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
G W L ------- ---
-----------------------------
net/netfilter/ipset/ip_set_core.c:1200 suspicious
rcu_dereference_protected() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
3 locks held by kworker/u128:1/264:
#0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
process_one_work+0xeab/0x1460
#1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
process_one_work+0x82b/0x1460
#2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
cleanup_net+0xb9/0xa90

stack backtrace:
CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
#1
Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
WIFI, BIOS 2611 04/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
lockdep_rcu_suspicious.cold+0xa1/0x134
_destroy_all_sets+0x1c7/0x560 [ip_set]
ip_set_net_exit+0x20/0x50 [ip_set]
ops_exit_list+0x99/0x170
cleanup_net+0x4d9/0xa90
? __pfx_cleanup_net+0x10/0x10
process_one_work+0x8a4/0x1460
? worker_thread+0xe3/0x1010
? __pfx_process_one_work+0x10/0x10
? assign_work+0x16c/0x240
worker_thread+0x5e6/0x1010
? __kthread_parkme+0xb1/0x1d0
? __pfx_worker_thread+0x10/0x10
? __pfx_worker_thread+0x10/0x10
kthread+0x2d2/0x3a0
? _raw_spin_unlock_irq+0x28/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>

=============================
WARNING: suspicious RCU usage
6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
G W L ------- ---
-----------------------------
net/netfilter/ipset/ip_set_core.c:1211 suspicious
rcu_dereference_protected() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
3 locks held by kworker/u128:1/264:
#0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
process_one_work+0xeab/0x1460
#1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
process_one_work+0x82b/0x1460
#2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
cleanup_net+0xb9/0xa90

stack backtrace:
CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
#1
Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
WIFI, BIOS 2611 04/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
lockdep_rcu_suspicious.cold+0xa1/0x134
_destroy_all_sets+0x3a8/0x560 [ip_set]
ip_set_net_exit+0x20/0x50 [ip_set]
ops_exit_list+0x99/0x170
cleanup_net+0x4d9/0xa90
? __pfx_cleanup_net+0x10/0x10
process_one_work+0x8a4/0x1460
? worker_thread+0xe3/0x1010
? __pfx_process_one_work+0x10/0x10
? assign_work+0x16c/0x240
worker_thread+0x5e6/0x1010
? __kthread_parkme+0xb1/0x1d0
? __pfx_worker_thread+0x10/0x10
? __pfx_worker_thread+0x10/0x10
kthread+0x2d2/0x3a0
? _raw_spin_unlock_irq+0x28/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 7 times,
consider switching to WQ_UNBOUND
workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 11 times,
consider switching to WQ_UNBOUND

Bisect blame this commit:
commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10
Author: Jozsef Kadlecsik <kadlec@xxxxxxxxxxxxx>
Date: Tue Jun 4 15:58:03 2024 +0200

netfilter: ipset: Fix race between namespace cleanup and gc in the
list:set type

Lion Ackermann reported that there is a race condition between
namespace cleanup
in ipset and the garbage collection of the list:set type. The namespace
cleanup can destroy the list:set type of sets while the gc of the
set type is
waiting to run in rcu cleanup. The latter uses data from the
destroyed set which
thus leads use after free. The patch contains the following parts:

- When destroying all sets, first remove the garbage collectors, then wait
if needed and then destroy the sets.
- Fix the badly ordered "wait then remove gc" for the destroy a single set
case.
- Fix the missing rcu locking in the list:set type in the userspace test
case.
- Use proper RCU list handlings in the list:set type.

The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list
flush to cancel_gc).

Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression
in swap operation)
Reported-by: Lion Ackermann <nnamrec@xxxxxxxxx>
Tested-by: Lion Ackermann <nnamrec@xxxxxxxxx>
Signed-off-by: Jozsef Kadlecsik <kadlec@xxxxxxxxxxxxx>
Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>

net/netfilter/ipset/ip_set_core.c | 81
++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
net/netfilter/ipset/ip_set_list_set.c | 30 ++++++++++++++----------------
2 files changed, 60 insertions(+), 51 deletions(-)

And I can confirm after reverting 4e7aaa6b82d6 the issue is gone.

I also attach the build config and full kernel log.

My hardware specs: https://linux-hardware.org/?probe=80512f0c04

Jozsef can you look into this please?

--
Best Regards,
Mike Gavrilov.

Attachment: dmesg-6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug.zip
Description: Zip archive

Attachment: .config.zip
Description: Zip archive