Re: [net,v6,1/1] net/sched: act_gate: snapshot parameters with RCU on replace

From: Simon Horman

Date: Fri Feb 13 2026 - 10:56:06 EST


This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/sched: act_gate: snapshot parameters with RCU on replace

The gate action can be replaced while the hrtimer callback or dump path is
walking the schedule list.

Convert the parameters to an RCU-protected snapshot and swap updates under
tcf_lock, freeing the previous snapshot via call_rcu(). When REPLACE omits
the entry list, preserve the existing schedule so the effective state is
unchanged.

> diff --git a/include/net/tc_act/tc_gate.h b/include/net/tc_act/tc_gate.h
> index c1a67149c6b6..5223c00279d5 100644
> --- a/include/net/tc_act/tc_gate.h
> +++ b/include/net/tc_act/tc_gate.h
> @@ -32,6 +32,7 @@ struct tcf_gate_params {
> s32 tcfg_clockid;
> size_t num_entries;
> struct list_head entries;
> + struct rcu_head rcu;
> };
>
> #define GATE_ACT_GATE_OPEN BIT(0)
> @@ -39,7 +40,7 @@ struct tcf_gate_params {
>
> struct tcf_gate {
> struct tc_action common;
> - struct tcf_gate_params param;
> + struct tcf_gate_params __rcu *param;
> u8 current_gate_status;
> ktime_t current_close_time;
> u32 current_entry_octets;
> @@ -51,47 +52,65 @@ struct tcf_gate {
>
> #define to_gate(a) ((struct tcf_gate *)a)
>
> +static inline struct tcf_gate_params *tcf_gate_params_locked(const struct tc_action *a)
> +{
> + struct tcf_gate *gact = to_gate(a);
> +
> + return rcu_dereference_protected(gact->param,
> + lockdep_is_held(&gact->tcf_lock));
> +}
> +
> static inline s32 tcf_gate_prio(const struct tc_action *a)
> {
> + struct tcf_gate_params *p;
> s32 tcfg_prio;
>
> - tcfg_prio = to_gate(a)->param.tcfg_priority;
> + p = tcf_gate_params_locked(a);
> + tcfg_prio = p->tcfg_priority;
>
> return tcfg_prio;
> }

[ ... ]

> diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> index c1f75f272757..60c80e609ec3 100644
> --- a/net/sched/act_gate.c
> +++ b/net/sched/act_gate.c

[ ... ]

> @@ -509,10 +621,9 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> struct nlattr *entry_list;
> struct tcf_t t;
>
> - spin_lock_bh(&gact->tcf_lock);
> - opt.action = gact->tcf_action;
> -
> - p = &gact->param;
> + rcu_read_lock();
> + opt.action = READ_ONCE(gact->tcf_action);
> + p = rcu_dereference(gact->param);
>
> if (nla_put(skb, TCA_GATE_PARMS, sizeof(opt), &opt))
> goto nla_put_failure;
>
> @@ -552,12 +663,12 @@ static int tcf_gate_dump(struct sk_buff *skb, struct tc_action *a,
> tcf_tm_dump(&t, &gact->tcf_tm);
> if (nla_put_64bit(skb, TCA_GATE_TM, sizeof(t), &t, TCA_GATE_PAD))
> goto nla_put_failure;
> - spin_unlock_bh(&gact->tcf_lock);
> + rcu_read_unlock();
>
> return skb->len;
>
> nla_put_failure:
> - spin_unlock_bh(&gact->tcf_lock);
> + rcu_read_unlock();
> nlmsg_trim(skb, b);
> return -1;
> }

Can tcf_gate_dump be called before gact->param is initialized?

After the conversion to RCU pointers, if tcf_gate_dump is called during
action initialization (between tcf_idr_create_from_flags and
rcu_replace_pointer in tcf_gate_init), p will be NULL:

net/sched/act_gate.c:tcf_gate_init() {
tcf_idr_create_from_flags(...);
...
// gact->param still NULL here
...
rcu_replace_pointer(gact->param, p, ...); // line 517
}

net/sched/act_gate.c:tcf_gate_dump() {
rcu_read_lock();
p = rcu_dereference(gact->param); // can be NULL

if (nla_put_u64_64bit(skb, TCA_GATE_BASE_TIME,
p->tcfg_basetime, TCA_GATE_PAD)) // NULL deref
}

Before the patch, param was an embedded struct that was zero-initialized,
so dump would return valid (if meaningless) data. After the patch, this
becomes a NULL pointer dereference.

While the tc action framework may prevent dumping partially initialized
actions through external synchronization, should this code include a NULL
check to make the initialization window safer?

The second issue is in the hardware offload path. Looking at
net/sched/act_gate.c:tcf_gate_offload_act_setup(), the function calls the
accessor helpers without holding tcf_lock:

net/sched/act_gate.c:tcf_gate_offload_act_setup() {
// no lock held here
entry->gate.prio = tcf_gate_prio(act); // line 723
entry->gate.basetime = tcf_gate_basetime(act);
entry->gate.cycletime = tcf_gate_cycletime(act);
...
}

include/net/tc_act/tc_gate.h:tcf_gate_prio() {
p = tcf_gate_params_locked(a); // uses rcu_dereference_protected
...
}

include/net/tc_act/tc_gate.h:tcf_gate_params_locked() {
return rcu_dereference_protected(gact->param,
lockdep_is_held(&gact->tcf_lock));
}

Does tcf_gate_offload_act_setup violate the locking requirements?

The accessor functions all use tcf_gate_params_locked(), which requires
tcf_lock to be held (verified via lockdep_is_held). The offload setup
function doesn't acquire this lock before calling the accessors.

With lockdep enabled, this will trigger warnings. Without lockdep, there's
a race where param can be replaced via rcu_replace_pointer while the
offload function is reading it, potentially causing reads of inconsistent
state.

The dump path uses the correct pattern with rcu_read_lock() and
rcu_dereference(). Should the offload path either acquire tcf_lock or use
a similar RCU-only approach?