Re: [net,v6,1/1] net/sched: act_gate: snapshot parameters with RCU on replace
From: Jamal Hadi Salim
Date: Wed Feb 18 2026 - 08:33:56 EST
On Fri, Feb 13, 2026 at 12:37 PM Paul Moses <p@xxxxxxx> wrote:
>
> 1. I don’t think this is reachable, during create, the IDR slot is populated
> with ERR_PTR(-EBUSY) by tcf_idr_check_alloc(). The real action pointer is
> only published later by tcf_idr_insert_many(), which runs after init()
> returns and after tcf_gate_init() has already done:
>
> rcu_replace_pointer(gact->param, p, ...)
>
> Both the normal lookup path and the dump walker treat ERR_PTR entries as
> “not ready”: tcf_idr_search() rejects them and the dump walker skips.
>
Correct.
> 2. offload_act_setup() is currently called with act->tcfa_lock held in both
> call chains (net/sched/act_api.c and net/sched/cls_api.c). Since
> gact->tcf_lock aliases common.tcfa_lock, the
> lockdep_is_held(&gact->tcf_lock) condition in tcf_gate_params_locked()
> is satisfied.
>
The spinlock will catch it as stated.
The AI is clearly hallucinating..
cheers,
jamal
> Thanks,
> Paul
>
>
>
>
> On Friday, February 13th, 2026 at 9:55 AM, Simon Horman <horms@xxxxxxxxxx> wrote:
>
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> >
> > For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> > ---
> > net/sched: act_gate: snapshot parameters with RCU on replace
> >
> > The gate action can be replaced while the hrtimer callback or dump path is
> > walking the schedule list.
> >
> > Convert the parameters to an RCU-protected snapshot and swap updates under
> > tcf_lock, freeing the previous snapshot via call_rcu(). When REPLACE omits
> > the entry list, preserve the existing schedule so the effective state is
> > unchanged.
> >
> > > diff --git a/include/net/tc_act/tc_gate.h b/include/net/tc_act/tc_gate.h
> > > index c1a67149c6b6..5223c00279d5 100644
> > > --- a/include/net/tc_act/tc_gate.h
> > > +++ b/include/net/tc_act/tc_gate.h
> > > @@ -32,6 +32,7 @@ struct tcf_gate_params {
> > > s32 tcfg_clockid;
> > > size_t num_entries;
> > > struct list_head entries;
> > > + struct rcu_head rcu;
> > > };
> > >
> > > #define GATE_ACT_GATE_OPEN BIT(0)
> > > @@ -39,7 +40,7 @@ struct tcf_gate_params {
> > >
> > > struct tcf_gate {
> > > struct tc_action common;
> > > - struct tcf_gate_params param;
> > > + struct tcf_gate_params __rcu *param;
> > > u8 current_gate_status;
> > > ktime_t current_close_time;
> > > u32 current_entry_octets;
> > > @@ -51,47 +52,65 @@ struct tcf_gate {
> > >
> > > #define to_gate(a) ((struct tcf_gate *)a)
> > >
> > > +static inline struct tcf_gate_params *tcf_gate_params_locked(const struct tc_action *a)
> > > +{
> > > + struct tcf_gate *gact = to_gate(a);
> > > +
> > > + return rcu_dereference_protected(gact->param,
> > > + lockdep_is_held(&gact->tcf_lock));
> > > +}
> > > +
> > > static inline s32 tcf_gate_prio(const struct tc_action *a)
> > > {
> > > + struct tcf_gate_params *p;
> > > s32 tcfg_prio;
> > >
> > > - tcfg_prio = to_gate(a)->param.tcfg_priority;
> > > + p = tcf_gate_params_locked(a);
> > > + tcfg_prio = p->tcfg_priority;
> > >
> > > return tcfg_prio;
> > > }
> >
> > [ ... ]
> >
> > > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > > index c1f75f272757..60c80e609ec3 100644
> > > --- a/net/sched/act_gate.c
> > > +++ b/net/sched/act_gate.c
> >
> > [ ... ]
> >
> > > @@ -509,10 +621,9 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> > > struct nlattr *entry_list;
> > > struct tcf_t t;
> > >
> > > - spin_lock_bh(&gact->tcf_lock);
> > > - opt.action = gact->tcf_action;
> > > -
> > > - p = &gact->param;
> > > + rcu_read_lock();
> > > + opt.action = READ_ONCE(gact->tcf_action);
> > > + p = rcu_dereference(gact->param);
> > >
> > > if (nla_put(skb, TCA_GATE_PARMS, sizeof(opt), &opt))
> > > goto nla_put_failure;
> > >
> > > @@ -552,12 +663,12 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> > > tcf_tm_dump(&t, &gact->tcf_tm);
> > > if (nla_put_64bit(skb, TCA_GATE_TM, sizeof(t), &t, TCA_GATE_PAD))
> > > goto nla_put_failure;
> > > - spin_unlock_bh(&gact->tcf_lock);
> > > + rcu_read_unlock();
> > >
> > > return skb->len;
> > >
> > > nla_put_failure:
> > > - spin_unlock_bh(&gact->tcf_lock);
> > > + rcu_read_unlock();
> > > nlmsg_trim(skb, b);
> > > return -1;
> > > }
> >
> > Can tcf_gate_dump be called before gact->param is initialized?
> >
> > After the conversion to RCU pointers, if tcf_gate_dump is called during
> > action initialization (between tcf_idr_create_from_flags and
> > rcu_replace_pointer in tcf_gate_init), p will be NULL:
> >
> > net/sched/act_gate.c:tcf_gate_init() {
> > tcf_idr_create_from_flags(...);
> > ...
> > // gact->param still NULL here
> > ...
> > rcu_replace_pointer(gact->param, p, ...); // line 517
> > }
> >
> > net/sched/act_gate.c:tcf_gate_dump() {
> > rcu_read_lock();
> > p = rcu_dereference(gact->param); // can be NULL
> >
> > if (nla_put_u64_64bit(skb, TCA_GATE_BASE_TIME,
> > p->tcfg_basetime, TCA_GATE_PAD)) // NULL deref
> > }
> >
> > Before the patch, param was an embedded struct that was zero-initialized,
> > so dump would return valid (if meaningless) data. After the patch, this
> > becomes a NULL pointer dereference.
> >
> > While the tc action framework may prevent dumping partially initialized
> > actions through external synchronization, should this code include a NULL
> > check to make the initialization window safer?
> >
> > The second issue is in the hardware offload path. Looking at
> > net/sched/act_gate.c:tcf_gate_offload_act_setup(), the function calls the
> > accessor helpers without holding tcf_lock:
> >
> > net/sched/act_gate.c:tcf_gate_offload_act_setup() {
> > // no lock held here
> > entry->gate.prio = tcf_gate_prio(act); // line 723
> > entry->gate.basetime = tcf_gate_basetime(act);
> > entry->gate.cycletime = tcf_gate_cycletime(act);
> > ...
> > }
> >
> > include/net/tc_act/tc_gate.h:tcf_gate_prio() {
> > p = tcf_gate_params_locked(a); // uses rcu_dereference_protected
> > ...
> > }
> >
> > include/net/tc_act/tc_gate.h:tcf_gate_params_locked() {
> > return rcu_dereference_protected(gact->param,
> > lockdep_is_held(&gact->tcf_lock));
> > }
> >
> > Does tcf_gate_offload_act_setup violate the locking requirements?
> >
> > The accessor functions all use tcf_gate_params_locked(), which requires
> > tcf_lock to be held (verified via lockdep_is_held). The offload setup
> > function doesn't acquire this lock before calling the accessors.
> >
> > With lockdep enabled, this will trigger warnings. Without lockdep, there's
> > a race where param can be replaced via rcu_replace_pointer while the
> > offload function is reading it, potentially causing reads of inconsistent
> > state.
> >
> > The dump path uses the correct pattern with rcu_read_lock() and
> > rcu_dereference(). Should the offload path either acquire tcf_lock or use
> > a similar RCU-only approach?
> >