Re: [PATCH net] ipv6: fix data race in fib6_metric_set() using cmpxchg
From: Jiayuan Chen
Date: Thu Mar 26 2026 - 04:04:13 EST
On 3/26/26 3:13 PM, Hangbin Liu wrote:
On Thu, Mar 26, 2026 at 06:44:43AM +0000, Hangbin Liu wrote:If concurrent writes to the same memory by two CPUs are considered a bug,
On Thu, Mar 26, 2026 at 02:23:15PM +0800, Jiayuan Chen wrote:BTW, do we really need to WRITE_ONCE here? What if the `val` are different
On 3/26/26 12:22 PM, Hangbin Liu wrote:Thanks, I will update this in next version.
fib6_metric_set() may be called concurrently from softirq context without
holding the FIB table lock. A typical path is:
ndisc_router_discovery()
spin_unlock_bh(&table->tb6_lock) <- lock released
fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call
When two CPUs process Router Advertisement packets for the same router
simultaneously, they can both arrive at fib6_metric_set() with the same
fib6_info pointer whose fib6_metrics still points to dst_default_metrics.
if (f6i->fib6_metrics == &dst_default_metrics) { /* both CPUs: true */
struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
refcount_set(&p->refcnt, 1);
f6i->fib6_metrics = p; /* CPU1 overwrites CPU0's p -> p0 leaked */
}
The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
to it anywhere in memory, producing a kmemleak report:
unreferenced object 0xff1100025aca1400 (size 96):
comm "softirq", pid 0, jiffies 4299271239
backtrace:
kmalloc_trace+0x28a/0x380
fib6_metric_set+0xcd/0x180
ndisc_router_discovery+0x12dc/0x24b0
icmpv6_rcv+0xc16/0x1360
Fix this by replacing the plain pointer store with cmpxchg() and free
the allocation safely when competition failed.
Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Fei Liu <feliu@xxxxxxxxxx>
Signed-off-by: Hangbin Liu <liuhangbin@xxxxxxxxx>
---
net/ipv6/ip6_fib.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index dd26657b6a4a..64de761f40d5 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -730,14 +730,16 @@ void fib6_metric_set(struct fib6_info *f6i, int metric, u32 val)
if (!f6i)
return;
- if (f6i->fib6_metrics == &dst_default_metrics) {
+ if (READ_ONCE(f6i->fib6_metrics) == &dst_default_metrics) {
+ struct dst_metrics *dflt = (struct dst_metrics *)&dst_default_metrics;
struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
if (!p)
return;
refcount_set(&p->refcnt, 1);
- f6i->fib6_metrics = p;
+ if (cmpxchg(&f6i->fib6_metrics, dflt, p) != dflt)
+ kfree(p);
}
[...]
f6i->fib6_metrics->metrics[metric - 1] = val;Suggest using marked accessors to suppress KCSAN warnings:
struct dst_metrics *m = READ_ONCE(f6i->fib6_metrics);
WRITE_ONCE(m->metrics[metric - 1], val);
on 2 CPUs? This would hide the problem, right?
Thanks
Hangbin
the solution is prevention, not detection by KCSAN. Furthermore, WRITE_ONCE
also helps suppress warnings from read-write races. For example, commit
6e84fc395e90 demonstrates this.