Re: [BUG] net/sched: skb leak with HTB + fq_codel on packet drops

From: Fernando Fernandez Mancera

Date: Tue Apr 07 2026 - 12:43:15 EST


On 4/6/26 4:29 PM, Damilola Bello wrote:
Description:

Using fq_codel as a child qdisc under HTB results in continuous growth
of skbuff_head_cache objects when packet drops occur. Memory is not
freed even after traffic stops, and the system can eventually run out
of memory.

Regression:

- Works on: 6.18.16

- Fails on: 6.19.x (tested on 6.19.10-200.fc43)

Environment:

- Kernel: 6.19.10-200.fc43

- Distro: Fedora 43

- NICs: ens2f0np0, ens2f1np1

- GRO/GSO/TSO: disabled

Reproduction:

#!/bin/sh
DEVS="ens2f0np0 ens2f1np1"
for DEV in $DEVS; do
tc qdisc del dev $DEV root 2>/dev/null
tc qdisc add dev $DEV root handle 1: htb default 10
tc class add dev $DEV parent 1: classid 1:10 htb rate 100mbit
tc qdisc add dev $DEV parent 1:10 handle 10: fq_codel
tc filter add dev $DEV parent 1: matchall flowid 1:10
done

Generate traffic exceeding 100mbit (e.g., iperf3) to force drops.


Hi,

I managed to reproduce this and did a bisect. This commit introduces the issue:

commit a6efc273ab8245722eee2150fa12cf75781dc410
Author: Eric Dumazet <edumazet@xxxxxxxxxx>
Date: Fri Nov 21 08:32:56 2025 +0000

net_sched: use qdisc_dequeue_drop() in cake, codel, fq_codel

cake, codel and fq_codel can drop many packets from dequeue().

Use qdisc_dequeue_drop() so that the freeing can happen
outside of the qdisc spinlock scope.

Add TCQ_F_DEQUEUE_DROPS to sch->flags.

Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
Link: https://patch.msgid.link/20251121083256.674562-15-edumazet@xxxxxxxxxx
Signed-off-by: Paolo Abeni <pabeni@xxxxxxxxxx>


I tested this solution a bit and seems to be fine. We could probably just drop directly if the qdisc isn't the root.. but I believe this is cleaner and more future-proof.

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index c3d657359a3d..61ba54e909f2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -1170,12 +1170,18 @@ static inline void tcf_kfree_skb_list(struct sk_buff *skb)
static inline void qdisc_dequeue_drop(struct Qdisc *q, struct sk_buff *skb,
enum skb_drop_reason reason)
{
+ struct Qdisc *root = qdisc_root_sleeping(q);
+
DEBUG_NET_WARN_ON_ONCE(!(q->flags & TCQ_F_DEQUEUE_DROPS));
DEBUG_NET_WARN_ON_ONCE(q->flags & TCQ_F_NOLOCK);

- tcf_set_drop_reason(skb, reason);
- skb->next = q->to_free;
- q->to_free = skb;
+ if (root->flags & TCQ_F_DEQUEUE_DROPS) {
+ tcf_set_drop_reason(skb, reason);
+ skb->next = q->to_free;
+ q->to_free = skb;
+ } else {
+ kfree_skb_reason(skb, reason);
+ }
}

/* Instead of calling kfree_skb() while root qdisc lock is held,

Of course, in this situation if the root qdisc does not support TCQ_F_DEQUEUE_DROPS flag then, the child won't use the optimization. If I am not wrong all the qdiscs that can be parent currently do not support TCQ_F_DEQUEUE_DROPS. Anyway, this generic fix is in my opinion cleaner than handling every caller.

I am sending this as patch for net tree.

Thanks,
Fernando.