Re: [PATCH net] net: sched: act_api: size RTM_GETACTION reply by fill size

From: Paul Moses

Date: Fri Jan 30 2026 - 12:23:59 EST


Yes, In net/sched/act_api.c the GETACTION notify path always does alloc_skb(NLMSG_GOODSIZE), if tca_get_fill()
runs out of tailroom it returns -1 and tcf_get_notify() maps that to -EINVAL. So failures are size-dependent
and can look intermittent across different action dumps. act_gate might be the outlier?

The size is already computed in tca_action_gd() (sum tcf_action_fill_size() then tcf_action_full_attrs_size())
and add/del already allocate max(attr_size, NLMSG_GOODSIZE). This patch just makes GETACTION consistent with
that.

On the reproducer: the gatebench test with 100 entries is reasonable.
https://raw.githubusercontent.com/jopamo/gatebench/refs/heads/main/src/selftests/test_large_dump.c

I plan to follow this up with another patch for act_gate and believe they both are integral to fully stabilize
act_gate.

Thanks
Paul



On Friday, January 30th, 2026 at 10:05 AM, Jamal Hadi Salim <jhs@xxxxxxxxxxxx> wrote:

>
>
> On Fri, Jan 30, 2026 at 8:43 AM Paul Moses p@xxxxxxx wrote:
>
> > tcf_action_fill_size() already computes the required dump size, but
> > RTM_GETACTION replies always allocate NLMSG_GOODSIZE. Large action
> > state can overrun that skb and make dumps fail.
> >
> > Use the computed reply size for RTM_GETACTION replies so large actions
> > can be dumped, while still keeping NLMSG_GOODSIZE as a floor.
> >
> > Fixes: 4e76e75d6aba ("net sched actions: calculate add/delete event message size")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Paul Moses p@xxxxxxx
> > ---
> > net/sched/act_api.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> > index e1ab0faeb8113..8ab016d352850 100644
> > --- a/net/sched/act_api.c
> > +++ b/net/sched/act_api.c
> > @@ -1685,12 +1685,12 @@ static int tca_get_fill(struct sk_buff *skb, struct tc_action *actions[],
> >
> > static int
> > tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n,
> > - struct tc_action *actions[], int event,
> > + struct tc_action *actions[], int event, size_t attr_size,
> > struct netlink_ext_ack *extack)
> > {
> > struct sk_buff *skb;
> >
> > - skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
> > + skb = alloc_skb(max_t(size_t, attr_size, NLMSG_GOODSIZE), GFP_KERNEL);
> > if (!skb)
> > return -ENOBUFS;
> > if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, 0, event,
> > @@ -2041,7 +2041,8 @@ tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
> > attr_size = tcf_action_full_attrs_size(attr_size);
> >
> > if (event == RTM_GETACTION)
> > - ret = tcf_get_notify(net, portid, n, actions, event, extack);
> > + ret = tcf_get_notify(net, portid, n, actions, event,
> > + attr_size, extack);
> > else { /* delete */
> > ret = tcf_del_notify(net, n, actions, portid, attr_size, extack);
> > if (ret)
>
>
> dunno. Is this based on some issue you found? This is a common pattern
> in a lot of places in the stack and has not caused any issues (afaik).
>
> cheers,
> jamal