Re: [net,v6,1/1] net/sched: act_gate: snapshot parameters with RCU on replace

From: Paul Moses

Date: Fri Feb 13 2026 - 12:38:13 EST


1. I don’t think this is reachable, during create, the IDR slot is populated
with ERR_PTR(-EBUSY) by tcf_idr_check_alloc(). The real action pointer is
only published later by tcf_idr_insert_many(), which runs after init()
returns and after tcf_gate_init() has already done:

rcu_replace_pointer(gact->param, p, ...)

Both the normal lookup path and the dump walker treat ERR_PTR entries as
“not ready”: tcf_idr_search() rejects them and the dump walker skips.

2. offload_act_setup() is currently called with act->tcfa_lock held in both
call chains (net/sched/act_api.c and net/sched/cls_api.c). Since
gact->tcf_lock aliases common.tcfa_lock, the
lockdep_is_held(&gact->tcf_lock) condition in tcf_gate_params_locked()
is satisfied.

Thanks,
Paul




On Friday, February 13th, 2026 at 9:55 AM, Simon Horman <horms@xxxxxxxxxx> wrote:

> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net/sched: act_gate: snapshot parameters with RCU on replace
>
> The gate action can be replaced while the hrtimer callback or dump path is
> walking the schedule list.
>
> Convert the parameters to an RCU-protected snapshot and swap updates under
> tcf_lock, freeing the previous snapshot via call_rcu(). When REPLACE omits
> the entry list, preserve the existing schedule so the effective state is
> unchanged.
>
> > diff --git a/include/net/tc_act/tc_gate.h b/include/net/tc_act/tc_gate.h
> > index c1a67149c6b6..5223c00279d5 100644
> > --- a/include/net/tc_act/tc_gate.h
> > +++ b/include/net/tc_act/tc_gate.h
> > @@ -32,6 +32,7 @@ struct tcf_gate_params {
> > s32 tcfg_clockid;
> > size_t num_entries;
> > struct list_head entries;
> > + struct rcu_head rcu;
> > };
> >
> > #define GATE_ACT_GATE_OPEN BIT(0)
> > @@ -39,7 +40,7 @@ struct tcf_gate_params {
> >
> > struct tcf_gate {
> > struct tc_action common;
> > - struct tcf_gate_params param;
> > + struct tcf_gate_params __rcu *param;
> > u8 current_gate_status;
> > ktime_t current_close_time;
> > u32 current_entry_octets;
> > @@ -51,47 +52,65 @@ struct tcf_gate {
> >
> > #define to_gate(a) ((struct tcf_gate *)a)
> >
> > +static inline struct tcf_gate_params *tcf_gate_params_locked(const struct tc_action *a)
> > +{
> > + struct tcf_gate *gact = to_gate(a);
> > +
> > + return rcu_dereference_protected(gact->param,
> > + lockdep_is_held(&gact->tcf_lock));
> > +}
> > +
> > static inline s32 tcf_gate_prio(const struct tc_action *a)
> > {
> > + struct tcf_gate_params *p;
> > s32 tcfg_prio;
> >
> > - tcfg_prio = to_gate(a)->param.tcfg_priority;
> > + p = tcf_gate_params_locked(a);
> > + tcfg_prio = p->tcfg_priority;
> >
> > return tcfg_prio;
> > }
>
> [ ... ]
>
> > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > index c1f75f272757..60c80e609ec3 100644
> > --- a/net/sched/act_gate.c
> > +++ b/net/sched/act_gate.c
>
> [ ... ]
>
> > @@ -509,10 +621,9 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> > struct nlattr *entry_list;
> > struct tcf_t t;
> >
> > - spin_lock_bh(&gact->tcf_lock);
> > - opt.action = gact->tcf_action;
> > -
> > - p = &gact->param;
> > + rcu_read_lock();
> > + opt.action = READ_ONCE(gact->tcf_action);
> > + p = rcu_dereference(gact->param);
> >
> > if (nla_put(skb, TCA_GATE_PARMS, sizeof(opt), &opt))
> > goto nla_put_failure;
> >
> > @@ -552,12 +663,12 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> > tcf_tm_dump(&t, &gact->tcf_tm);
> > if (nla_put_64bit(skb, TCA_GATE_TM, sizeof(t), &t, TCA_GATE_PAD))
> > goto nla_put_failure;
> > - spin_unlock_bh(&gact->tcf_lock);
> > + rcu_read_unlock();
> >
> > return skb->len;
> >
> > nla_put_failure:
> > - spin_unlock_bh(&gact->tcf_lock);
> > + rcu_read_unlock();
> > nlmsg_trim(skb, b);
> > return -1;
> > }
>
> Can tcf_gate_dump be called before gact->param is initialized?
>
> After the conversion to RCU pointers, if tcf_gate_dump is called during
> action initialization (between tcf_idr_create_from_flags and
> rcu_replace_pointer in tcf_gate_init), p will be NULL:
>
> net/sched/act_gate.c:tcf_gate_init() {
> tcf_idr_create_from_flags(...);
> ...
> // gact->param still NULL here
> ...
> rcu_replace_pointer(gact->param, p, ...); // line 517
> }
>
> net/sched/act_gate.c:tcf_gate_dump() {
> rcu_read_lock();
> p = rcu_dereference(gact->param); // can be NULL
>
> if (nla_put_u64_64bit(skb, TCA_GATE_BASE_TIME,
> p->tcfg_basetime, TCA_GATE_PAD)) // NULL deref
> }
>
> Before the patch, param was an embedded struct that was zero-initialized,
> so dump would return valid (if meaningless) data. After the patch, this
> becomes a NULL pointer dereference.
>
> While the tc action framework may prevent dumping partially initialized
> actions through external synchronization, should this code include a NULL
> check to make the initialization window safer?
>
> The second issue is in the hardware offload path. Looking at
> net/sched/act_gate.c:tcf_gate_offload_act_setup(), the function calls the
> accessor helpers without holding tcf_lock:
>
> net/sched/act_gate.c:tcf_gate_offload_act_setup() {
> // no lock held here
> entry->gate.prio = tcf_gate_prio(act); // line 723
> entry->gate.basetime = tcf_gate_basetime(act);
> entry->gate.cycletime = tcf_gate_cycletime(act);
> ...
> }
>
> include/net/tc_act/tc_gate.h:tcf_gate_prio() {
> p = tcf_gate_params_locked(a); // uses rcu_dereference_protected
> ...
> }
>
> include/net/tc_act/tc_gate.h:tcf_gate_params_locked() {
> return rcu_dereference_protected(gact->param,
> lockdep_is_held(&gact->tcf_lock));
> }
>
> Does tcf_gate_offload_act_setup violate the locking requirements?
>
> The accessor functions all use tcf_gate_params_locked(), which requires
> tcf_lock to be held (verified via lockdep_is_held). The offload setup
> function doesn't acquire this lock before calling the accessors.
>
> With lockdep enabled, this will trigger warnings. Without lockdep, there's
> a race where param can be replaced via rcu_replace_pointer while the
> offload function is reading it, potentially causing reads of inconsistent
> state.
>
> The dump path uses the correct pattern with rcu_read_lock() and
> rcu_dereference(). Should the offload path either acquire tcf_lock or use
> a similar RCU-only approach?
>