Re: [PATCH net v2 1/1] net/sched: act_api: use RCU with deferred freeing for action lifecycle

From: Kyle Zeng

Date: Sun May 31 2026 - 15:15:01 EST


On Sun, May 31, 2026 at 12:08:12PM -0400, Jamal Hadi Salim wrote:
> When NEWTFILTER and DELFILTER are run concurrently it is possible to create a
> race with an associated action.
>
> Let's illustrate with CPU0 running NEWTFILTER and CPU1 running DELFILTER:
>
> 0: mutex_lock() <-- holds the idr lock
> 0: rcu_read_lock()
> 0: p = idr_find(idr, index) <-- action p is valid (RCU protects IDR)
> 0: mutex_unlock() <-- releases the idr lock
> 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
> 1: idr_remove(idr, index) <-- Action removed from IDR
> 1: mutex_unlock() <-- mutex released allowing us to delete the action
> 1: tcf_action_cleanup(p); kfree(p) <-- kfree()s p immediately, no deferral
> 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- UAF: p points to freed memory
>
> This patch fixes the race condition between NEWTFILTER and DELFILTER by
> adding a struct rcu_head to tc_action for the deferral and introducing a
> call_rcu() in the delete path to defer the final kfree().
>
> Note: this is a revert of commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu"),
> with a modernization/simplification to use kfree_rcu() directly.
>
> Let's illustrate the new restored code path:
>
> 0: rcu_read_lock()
> 1: refcount_dec_and_mutex_lock() <-- refcnt 1->0, mutex held
> 1: idr_remove(idr, index)
> 1: mutex_unlock()
> 1: call_rcu(&p->tcfa_rcu, tcf_action_rcu_free) <-- defer kfree after grace period
> 0: p = idr_find(idr, index)
> 0: refcount_inc_not_zero(&p->tcfa_refcnt) <-- fails, refcnt already 0
> 0: rcu_read_unlock() <-- release so freeing can run after grace period
>
> After CPU1 calls idr_remove(), the object is no longer reachable through the IDR,
> so a subsequent idr_find() on CPU0 returns NULL. Even if CPU0 still holds a
> stale pointer, the kfree() is now deferred until after the RCU grace period,
> so refcount_inc_not_zero() fails on still-valid memory and no UAF can occur.
>
> Fixes: d7fb60b9cafb ("net_sched: get rid of tcfa_rcu")
> Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx>
> Reported-by: Kyle Zeng <kylebot@xxxxxxxxxx>
> Tested-by: Victor Nogueira <victor@xxxxxxxxxxxx>
> Tested-by: syzbot@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Jamal Hadi Salim <jhs@xxxxxxxxxxxx>

Tested-by: Kyle Zeng <kylebot@xxxxxxxxxx>
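
For anyone following the thread, here is a minimal userspace sketch of why deferring the free closes the window. It is a single-threaded toy model, not kernel code: every toy_* name is hypothetical, the "grace period" is just a reader counter, and it replays the stale-pointer ordering from the first trace (lookup before removal) rather than calling any real RCU or IDR API.

```c
/* Userspace toy model, NOT kernel code: every toy_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_action {
    int refcnt;                          /* stands in for tcfa_refcnt */
};

static struct toy_action *toy_idr_slot;  /* one-entry stand-in for the IDR */
static struct toy_action *toy_deferred;  /* object waiting for a grace period */
static int toy_readers;                  /* active read-side sections */
static int toy_frees;                    /* number of frees that actually ran */

/* Run the deferred free only when no reader is in a read-side section. */
static void toy_try_grace_period(void)
{
    if (toy_readers == 0 && toy_deferred) {
        free(toy_deferred);
        toy_deferred = NULL;
        toy_frees++;
    }
}

static void toy_read_lock(void)   { toy_readers++; }
static void toy_read_unlock(void) { toy_readers--; toy_try_grace_period(); }

/* Models refcount_inc_not_zero(): take a reference only if one still exists. */
static bool toy_inc_not_zero(struct toy_action *p)
{
    if (p->refcnt == 0)
        return false;
    p->refcnt++;
    return true;
}

/* Models the delete path: drop the last ref, unpublish, then defer the free. */
static void toy_delete(struct toy_action *p)
{
    if (--p->refcnt == 0) {
        toy_idr_slot = NULL;
        toy_deferred = p;
        toy_try_grace_period();
    }
}

/* Replays the interleaving; returns true if the reader failed safely. */
static bool toy_replay(void)
{
    struct toy_action *a = calloc(1, sizeof(*a));

    assert(a);
    a->refcnt = 1;
    toy_idr_slot = a;
    toy_frees = 0;

    toy_read_lock();                       /* CPU0: enter read-side section */
    struct toy_action *p = toy_idr_slot;   /* CPU0: lookup before removal */
    toy_delete(a);                         /* CPU1: refcnt 1->0, free deferred */
    bool freed_early = (toy_frees != 0);   /* must be false: reader is active */
    bool got_ref = toy_inc_not_zero(p);    /* CPU0: memory still valid, fails */
    toy_read_unlock();                     /* CPU0: deferred free may now run */

    return !freed_early && !got_ref && toy_frees == 1;
}
```

Calling toy_replay() should return true in this model: the free is held back while the reader is active, the reference grab fails on memory that is still valid, and the object is freed exactly once after the reader leaves. Freeing directly in toy_delete() instead would make the toy_inc_not_zero() call touch freed memory, which is the UAF in the first trace.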

Best,
Kyle