RE: [PATCH v2 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue

From: Chia-Yu Chang (Nokia)

Date: Thu Apr 16 2026 - 14:31:15 EST


> -----Original Message-----
> From: Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx>
> Sent: Thursday, April 16, 2026 7:55 PM
> To: Chia-Yu Chang (Nokia) <chia-yu.chang@xxxxxxxxxxxxxxxxxxx>
> Cc: victor@xxxxxxxxxxxx; hxzene@xxxxxxxxx; linux-hardening@xxxxxxxxxxxxxxx; kees@xxxxxxxxxx; gustavoars@xxxxxxxxxx; jhs@xxxxxxxxxxxx; jiri@xxxxxxxxxxx; davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; horms@xxxxxxxxxx; ij@xxxxxxxxxx; ncardwell@xxxxxxxxxx; Koen De Schepper (Nokia) <koen.de_schepper@xxxxxxxxxxxxxxxxxxx>; g.white@xxxxxxxxxxxxx; ingemar.s.johansson@xxxxxxxxxxxx; mirja.kuehlewind@xxxxxxxxxxxx; cheshire@xxxxxxxxx; rs.ietf@xxxxxx; Jason_Livingood@xxxxxxxxxxx; vidhi_goel@xxxxxxxxx
> Subject: Re: [PATCH v2 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue
>
> On Thu, 16 Apr 2026 19:09:06 +0200
> chia-yu.chang@xxxxxxxxxxxxxxxxxxx wrote:
>
> > From: Chia-Yu Chang <chia-yu.chang@xxxxxxxxxxxxxxxxxxx>
> >
> > Fix dualpi2_change() to correctly enforce updated limit and memlimit
> > values after a configuration change of the dualpi2 qdisc.
> >
> > Before this patch, dualpi2_change() always attempted to dequeue
> > packets via the root qdisc (C-queue) when reducing backlog or memory
> > usage, and unconditionally assumed that a valid skb will be returned.
> > When traffic classification results in packets being queued in the
> > L-queue while the C-queue is empty, this leads to a NULL skb
> > dereference during limit or memlimit enforcement.
> >
> > This is fixed by first dequeuing from the C-queue path if it is non-empty.
> > Once the C-queue is empty, packets are dequeued directly from the L-queue.
> > Return values from qdisc_dequeue_internal() are checked for both
> > queues. When dequeuing from the L-queue, the parent qdisc qlen and
> > backlog counters are updated explicitly to keep overall qdisc statistics consistent.
> >
> > Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2
> > qdisc")
> > Reported-by: "Kito Xu (veritas501)" <hxzene@xxxxxxxxx>
> > Signed-off-by: Chia-Yu Chang <chia-yu.chang@xxxxxxxxxxxxxxxxxxx>
> > ---
>
> I was a little concerned about the complexity of managing qlen here.
> But could not find anything obvious.

Hi Stephen,

This fix relies on some existing assumptions of DualPI2.

>
> Turned to AI review and it found some things:
>
> Right fix direction and the reported crash is real. A few issues before this is ready:
>
> 1. The `c_len` construction is fragile. Declared `int`, initialized from a `u32 - u32`. If the invariant `qdisc_qlen(sch) >= qdisc_qlen(q->l_queue)` is ever violated, you get a large positive value, the C-queue branch is taken on an empty C-queue, `qdisc_dequeue_internal()` returns NULL, and the loop breaks out without draining the L-queue -- leaving the qdisc over limit. Simpler and more robust to just compare the two qlens directly and drop the delta variable entirely.
>

The current dequeue_packet() in DualPI2 also calculates c_len in the same way (line 524).

We only track the queue length of the L-queue and the combined length of the C- and L-queues, so this subtraction is how we derive the queue length of the C-queue.
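
To make the two options concrete, here is a minimal user-space sketch of deriving the C-queue length as a delta versus comparing the two lengths directly. It is only a model: struct model_qlens and the model_* helpers are hypothetical names for illustration and are not part of sch_dualpi2.

```c
#include <stdint.h>

/* Hypothetical user-space model; not kernel code. */
struct model_qlens {
	uint32_t total_qlen;	/* models qdisc_qlen(sch): C-queue + L-queue */
	uint32_t l_qlen;	/* models qdisc_qlen(q->l_queue) */
};

/* C-queue length derived as a delta. The result is only meaningful
 * while total_qlen >= l_qlen; otherwise the unsigned subtraction wraps.
 */
static uint32_t model_c_len(const struct model_qlens *m)
{
	return m->total_qlen - m->l_qlen;
}

/* Direct comparison: no delta variable, so nothing can wrap. */
static int model_c_queue_nonempty(const struct model_qlens *m)
{
	return m->total_qlen > m->l_qlen;
}
```

As long as the parent count is never smaller than the L-queue count, both helpers agree. They differ only if that invariant is broken, where the direct comparison reports an empty C-queue while the delta wraps around.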

> 2. Missing else/termination. If both branches' conditions are false (neither `c_len` nor `qdisc_qlen(q->l_queue)`) but the outer `while` still holds because `memory_used > memory_limit`, the loop spins forever. An explicit `else break;` guards against an accounting desync becoming a hang.
>

This should not happen, but adding an explicit else guard is a good suggestion.
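
As a rough illustration of why the guard helps, below is a user-space model of a drain loop with the explicit else branch. It is a sketch under simplifying assumptions (plain counters instead of skbs, one fixed packet size); struct model_sched and model_drain() are hypothetical names, not the dualpi2_change() code.

```c
#include <stdint.h>

/* Hypothetical user-space model of the drain loop; not kernel code. */
struct model_sched {
	uint32_t total_qlen;	/* C-queue + L-queue packets */
	uint32_t l_qlen;	/* L-queue packets only */
	uint32_t limit;		/* packet limit */
	uint32_t memory_used;	/* bytes currently accounted */
	uint32_t memory_limit;	/* byte limit */
	uint32_t pkt_size;	/* assumed: every packet has this size */
};

/* Drop packets until both limits hold; returns the number dropped. */
static uint32_t model_drain(struct model_sched *s)
{
	uint32_t dropped = 0;

	while (s->total_qlen > s->limit ||
	       s->memory_used > s->memory_limit) {
		if (s->total_qlen > s->l_qlen) {
			/* C-queue is non-empty: drop from it first. */
			s->total_qlen--;
		} else if (s->l_qlen) {
			/* C-queue is empty: drop from the L-queue and
			 * update the parent count as well.
			 */
			s->l_qlen--;
			s->total_qlen--;
		} else {
			/* Both queues are empty but a limit is still
			 * exceeded: the accounting is out of sync, so
			 * stop instead of spinning.
			 */
			break;
		}
		s->memory_used -= s->pkt_size;
		dropped++;
	}
	return dropped;
}
```

With consistent counters the else branch is never reached. If memory_used is stale while both queues are empty, the model returns immediately instead of looping forever.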

> 3. Whitespace: two lines in the L-queue branch use spaces instead of tabs --
>
> + q->memory_used -= skb->truesize;
> + rtnl_qdisc_drop(skb, q->l_queue);
>
> checkpatch will flag this.

Sure, I will fix this; sorry for the oversight.

>
> 4. Comment style. The three-line comment at the end of the L-queue branch doesn't follow the net subsystem multi-line comment style (leading ' * ' on continuation lines, closing ' */' on its own line).
> Once the code is cleaner, the comment could also just be dropped or shortened to one line.
>

Thanks, I will fix this as well.

> 5. The accounting in the L-queue branch is correct, but only if you trace the enqueue invariants carefully: L-queue packets are counted in
> *both* `sch` and `q->l_queue` on enqueue (see dualpi2_enqueue_skb lines 413-423), `qdisc_dequeue_internal(q->l_queue, true)` adjusts l_queue's side, and the explicit `--sch->q.qlen` + `qdisc_qstats_backlog_dec(sch, skb)` adjusts sch's side. Separately, the C-queue branch now quietly relies on the post-CVE-2025-39677 semantics of `qdisc_dequeue_internal()` handling parent backlog -- which is why the pre-patch `qdisc_qstats_backlog_dec(sch, skb)` could be removed.
> Neither of these load-bearing invariants is documented in the code or the commit message. Please add an inline comment in the L-queue branch explaining the double-count-on-enqueue, and mention the
> qdisc_dequeue_internal() dependency in the commit log.

Yes, L-queue packets are counted in both the parent qdisc (sch) and the child qdisc (q->l_queue) during enqueue.
For the C-queue case, we reuse qdisc_dequeue_internal() from sch_generic.h.
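
The double counting can be pictured with a small user-space model of the counters. This is a sketch only: struct model_counters and the model_* helpers are hypothetical names, and the model ignores everything except qlen and backlog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the counters; not kernel code. */
struct model_counters {
	uint32_t sch_qlen;	/* parent qdisc packet count */
	uint32_t sch_backlog;	/* parent qdisc backlog in bytes */
	uint32_t l_qlen;	/* L-queue packet count */
	uint32_t l_backlog;	/* L-queue backlog in bytes */
};

/* Every packet is counted in the parent; L-queue packets are
 * additionally counted in the child.
 */
static void model_enqueue(struct model_counters *c, bool to_l_queue,
			  uint32_t len)
{
	c->sch_qlen++;
	c->sch_backlog += len;
	if (to_l_queue) {
		c->l_qlen++;
		c->l_backlog += len;
	}
}

/* Dropping an L-queue packet has to undo both sides. */
static void model_drop_from_l_queue(struct model_counters *c, uint32_t len)
{
	/* Child side: what dequeuing from the L-queue itself adjusts. */
	c->l_qlen--;
	c->l_backlog -= len;
	/* Parent side: adjusted explicitly by the caller. */
	c->sch_qlen--;
	c->sch_backlog -= len;
}
```

If the parent side were skipped when dropping from the L-queue, the parent would keep reporting packets that no longer exist in either queue.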

> 6. Commit message / subject. Subject reads as if only the L-queue path changed, but the whole drain loop was restructured. Something like
> "sch_dualpi2: drain both C-queue and L-queue in dualpi2_change()" would describe it better. Also, on NULL return from qdisc_dequeue_internal() the loop silently breaks -- if that ever triggers it means qdisc_qlen()
> > 0 but dequeue returned NULL, which is a real invariant violation.
> > Worth a WARN_ON_ONCE().
>
> Suggested shape:
>
> 	while (qdisc_qlen(sch) > sch->limit ||
> 	       q->memory_used > q->memory_limit) {
> 		struct sk_buff *skb;
>
> 		if (qdisc_qlen(sch) > qdisc_qlen(q->l_queue)) {
> 			skb = qdisc_dequeue_internal(sch, true);
> 			if (!skb)
> 				break;
> 			q->memory_used -= skb->truesize;
> 			rtnl_qdisc_drop(skb, sch);
> 		} else if (qdisc_qlen(q->l_queue)) {
> 			skb = qdisc_dequeue_internal(q->l_queue, true);
> 			if (!skb)
> 				break;
> 			/* L-queue packets are counted in both sch and
> 			 * l_queue on enqueue; qdisc_dequeue_internal()
> 			 * handled l_queue, account sch here.
> 			 */
> 			sch->q.qlen--;
> 			qdisc_qstats_backlog_dec(sch, skb);
> 			q->memory_used -= skb->truesize;
> 			rtnl_qdisc_drop(skb, q->l_queue);
> 			qdisc_qstats_drop(sch);
> 		} else {
> 			break;
> 		}
> 	}
>
>
> As with any AI feedback, expect it to generate hints but also be wrong.

I am OK with this suggestion and will address it in v3.

But I would note that the original c_len calculation already exists in dequeue_packet() of dualpi2.

It works because we maintain both the parent and child qdisc statistics during normal enqueue and dequeue operations.

Thanks!
Chia-Yu