Re: [PATCH net v4 0/5] Lock RCU before calling ip6mr_get_table()

From: Stefan Wiehler
Date: Mon Oct 14 2024 - 11:04:44 EST


> Hi Stefan. I think a v5 is needed :)
>
> Please double check your syslog
>
> [ 18.149447] =============================
> [ 18.149471] WARNING: suspicious RCU usage
> [ 18.149649] 6.12.0-rc2-virtme #1155 Not tainted
> [ 18.149726] -----------------------------
> [ 18.149747] net/ipv6/ip6mr.c:131 RCU-list traversed in non-reader section!!
> [ 18.149792]
> other info that might help us debug this:
>
> [ 18.149824]
> rcu_scheduler_active = 2, debug_locks = 1
> [ 18.150050] 1 lock held by swapper/0/1:
> [ 18.150090] #0: ffffffff95b36390 (pernet_ops_rwsem){+.+.}-{3:3},
> at: register_pernet_subsys (net/core/net_namespace.c:1356)
> [ 18.151482]
> stack backtrace:
> [ 18.151716] CPU: 12 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.12.0-rc2-virtme #1155
> [ 18.151809] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 18.151982] Call Trace:
> [ 18.152122] <TASK>
> [ 18.152411] dump_stack_lvl (lib/dump_stack.c:123)
> [ 18.152411] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
> [ 18.152411] ip6mr_get_table (net/ipv6/ip6mr.c:131 (discriminator 9))
> [ 18.152411] ip6mr_net_init (net/ipv6/ip6mr.c:384
> net/ipv6/ip6mr.c:238 net/ipv6/ip6mr.c:1317 net/ipv6/ip6mr.c:1309)
> [ 18.152411] ops_init (net/core/net_namespace.c:139)
> [ 18.152411] register_pernet_operations
> (net/core/net_namespace.c:1239 net/core/net_namespace.c:1315)
> [ 18.152411] register_pernet_subsys (net/core/net_namespace.c:1357)
> [ 18.152411] ip6_mr_init (net/ipv6/ip6mr.c:1379)
> [ 18.152411] inet6_init (net/ipv6/af_inet6.c:1137)
> [ 18.152411] ? __pfx_inet6_init (net/ipv6/af_inet6.c:1076)
> [ 18.152411] do_one_initcall (init/main.c:1269)
> [ 18.152411] ? _raw_spin_unlock_irqrestore
> (./arch/x86/include/asm/irqflags.h:42
> ./arch/x86/include/asm/irqflags.h:97
> ./arch/x86/include/asm/irqflags.h:155
> ./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
> [ 18.152411] kernel_init_freeable (init/main.c:1330 (discriminator
> 1) init/main.c:1347 (discriminator 1) init/main.c:1366 (discriminator
> 1) init/main.c:1580 (discriminator 1))
> [ 18.152411] ? __pfx_kernel_init (init/main.c:1461)
> [ 18.152411] kernel_init (init/main.c:1471)
> [ 18.152411] ret_from_fork (arch/x86/kernel/process.c:153)
> [ 18.152411] ? __pfx_kernel_init (init/main.c:1461)
> [ 18.152411] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
> [ 18.152411] </TASK>

Thanks, I'm not sure why I missed that one, since it also shows up on our
v6.1-based kernel.

I went through all the remaining calls to ip6mr_get_table() in ip6mr.c (since
I've picked this topic up from a colleague):

- call in ip6mr_rule_action() is safe because fib_rules_lookup() holds the RCU
  read lock
- call in ipmr_mfc_seq_start() needs to be in an RCU read-side critical section
  as well
- calls in ip6mr_rtm_(set|get)sockopt() need to be in an RCU read-side critical
  section as well
- call in ip6mr_rtm_getroute() needs to take the RCU read lock earlier as well
- call in ip6mr_rtm_dumproute() is safe because rtnl_register_internal() holds
  the RTNL lock
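
To illustrate what the "needs to be in an RCU read-side critical section"
items above amount to, here is a minimal userspace sketch. It is not kernel
code and not the actual patch: every mock_* name, the depth counter, and the
suspicious_count variable are hypothetical stand-ins for rcu_read_lock(),
rcu_read_unlock(), ip6mr_get_table() and the lockdep check that produced the
splat quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of this is the kernel API. */
static int rcu_depth;         /* nesting depth of the mock read-side section */
static int suspicious_count;  /* times the mock "lockdep" check complained */

static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { rcu_depth--; }

struct mock_table {
	unsigned int id;
	struct mock_table *next;
};

/*
 * Stand-in for ip6mr_get_table(): walks the list and records a complaint
 * when it is entered outside a (mock) read-side critical section.
 */
static struct mock_table *mock_get_table(struct mock_table *head,
					 unsigned int id)
{
	struct mock_table *t;

	if (rcu_depth == 0)
		suspicious_count++;

	for (t = head; t != NULL; t = t->next)
		if (t->id == id)
			return t;
	return NULL;
}

/* Caller that forgets the read lock: the lookup works but is flagged. */
static int lookup_unlocked(struct mock_table *head, unsigned int id)
{
	return mock_get_table(head, id) != NULL;
}

/* Caller that wraps the lookup in the read-side section: no complaint. */
static int lookup_locked(struct mock_table *head, unsigned int id)
{
	int found;

	mock_rcu_read_lock();
	found = mock_get_table(head, id) != NULL;
	mock_rcu_read_unlock();
	return found;
}
```

Both callers find the table; only the unlocked one trips the check, which is
the situation the trace above reports for the ip6mr_net_init() path. In the
real code the result of the lookup must also only be used while the
protection is still held, which the sketch sidesteps by returning a plain
int.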

I'll prepare a v5; please review carefully, as I'm not familiar with the
codebase...