[PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle

From: Jamal Hadi Salim

Date: Sun May 31 2026 - 12:09:01 EST


When NEWTFILTER and DELFILTER are run concurrently it is possible to create a
race with an associated action.

Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER:

0: mutex_lock() <-- holds the idr lock
0: rcu_read_lock()
0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR)
0: mutex_unlock() <-- releases the idr lock
1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
1: idr_remove(idr, index) <-- Action removed from IDR
1: mutex_unlock() <-- mutex released allowing us to delete the action
1: tcf_action_cleanup(p); kfree(p) <-- Kfrees p immediately, no deferral
0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- ouch, UAF p points to freed memory

This patch fixes the race condition between NEWTFILTER and DELFILTER by
adding struct rcu_head to tc_action used in the deferral and introducing a
call_rcu() in the delete path to defer the final kfree().

Note: this is a revert of commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
but also modernization/simplification to directly use kfree_rcu().

Let's illustrate the new restored code path:

0: rcu_read_lock()
1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
1: idr_remove(idr, index)
1: mutex_unlock()
1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period
0: p = idr_find(idr, index)
0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0
1: rcu_read_unlock() <-- release so freeing can run after grace period

After CPU1 calls idr_remove(), the object is no longer reachable through the IDR.
CPU0's subsequent idr_find() will return NULL, and even if it still held a
stale pointer, the immediate kfree() is now deferred until after the RCU grace
period, so no UAF can occur.

Fixes: d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx>
Reported-by: Kyle Zeng <kylebot@xxxxxxxxxx>
Tested-by: Victor Nogueira <victor@xxxxxxxxxxxx>
Tested-by: syzbot@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Jamal Hadi Salim <jhs@xxxxxxxxxxxx>
---
v1->v2
1) syzbot ci found revealed that the cleanup code could be called under
softirq/atomic context. Because tcf_action_cleanup() calls __tcf_chain_put()
which grabs a mutex (which can sleep). Fix is to move tcf_action_cleanup()
to be invoked before call to rcu.
2) And while looking closely i noticed a bad comment on action code which
claims that "actions are always connected to filters" ;-> Tracing back
on git i noticed that infact we did have exactly this new change circa 2017
that was erronously removed by commit d7fb60b9cafb. Well, at least I
am certain what the "Fixes" tag should be now ;->
3) Patch is much much simpler now than it was in v1
---
include/net/act_api.h | 1 +
net/sched/act_api.c | 7 +------
2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index d11b79107930..fd2967ee08f7 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -45,6 +45,7 @@ struct tc_action {
struct tc_cookie __rcu *user_cookie;
struct tcf_chain __rcu *goto_chain;
u32 tcfa_flags;
+ struct rcu_head tcfa_rcu;
u8 hw_stats;
u8 used_hw_stats;
bool used_hw_stats_valid;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 332fd9695e54..04ea11c90e03 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -112,11 +112,6 @@ struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action,
}
EXPORT_SYMBOL(tcf_action_set_ctrlact);

-/* XXX: For standalone actions, we don't need a RCU grace period either, because
- * actions are always connected to filters and filters are already destroyed in
- * RCU callbacks, so after a RCU grace period actions are already disconnected
- * from filters. Readers later can not find us.
- */
static void free_tcf(struct tc_action *p)
{
struct tcf_chain *chain = rcu_dereference_protected(p->goto_chain, 1);
@@ -129,7 +124,7 @@ static void free_tcf(struct tc_action *p)
if (chain)
tcf_chain_put_by_act(chain);

- kfree(p);
+ kfree_rcu(p, tcfa_rcu);
}

static void offload_action_hw_count_set(struct tc_action *act,
--
2.34.1