[PATCH] net/sched: sch_qfq: account for stab overhead in qfq_enqueue

From: Mitchell Augustin
Date: Thu Aug 03 2023 - 21:38:54 EST


From cb3f87086b7d412df344f120ecd324412103c903 Thu Aug 3 19:28:04 2023
From: Mitchell Augustin <mitchell@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 3 Aug 2023 19:28:04 -0500
Subject: [PATCH] net/sched: sch_qfq: account for stab overhead in qfq_enqueue

[ Upstream commit 3e337087c3b5805fe0b8a46ba622a962880b5d64 ]

I am backporting this patch from mainline 6.5-rc4 (the commit referenced above). This is my first "real" kernel patch, so please let me know if I have done anything incorrectly here. Thanks!

Lion says:
-------
In the QFQ scheduler, an issue similar to CVE-2023-31436 persists.

Consider the following code in net/sched/sch_qfq.c:

static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                       struct sk_buff **to_free)
{
    unsigned int len = qdisc_pkt_len(skb), gso_segs;

    // ...

    if (unlikely(cl->agg->lmax < len)) {
        pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
                 cl->agg->lmax, len, cl->common.classid);
        err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
        if (err) {
            cl->qstats.drops++;
            return qdisc_drop(skb, sch, to_free);
        }

        // ...

    }

Similarly to CVE-2023-31436, "lmax" is increased without any bounds
checks according to the packet length "len". Usually this would not
pose a problem because packet sizes are naturally limited.

However, "len" is not the actual packet length but rather the value
returned by "qdisc_pkt_len(skb)", which may apply size transformations
according to a "struct qdisc_size_table" created by "qdisc_get_stab()"
in net/sched/sch_api.c if the TCA_STAB option was set when modifying
the qdisc.

A user may choose virtually any size using such a table.

As a result, the same issue as in CVE-2023-31436 can occur, allowing
heap out-of-bounds reads/writes in the kmalloc-8192 cache.
-------
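
For reference, the size transformation Lion describes is applied by
__qdisc_calculate_pkt_len() in net/sched/sch_api.c before the packet
reaches qfq_enqueue(). A simplified sketch of the relevant part (not
verbatim kernel code; the helper name below is only illustrative):

/* illustrative sketch of the stab transformation, not kernel code */
static void stab_overhead_sketch(struct sk_buff *skb,
                                 const struct qdisc_size_table *stab)
{
    /* the user-supplied overhead is added to the real packet length */
    int pkt_len = skb->len + stab->szopts.overhead;

    /* when tsize > 0, pkt_len is additionally mapped through the
     * user-supplied size table (omitted here) */

    /* this is the value qdisc_pkt_len() later reports as "len" */
    qdisc_skb_cb(skb)->pkt_len = pkt_len;
}

Since both the overhead and the table entries are user controlled,
"len" can end up arbitrarily far from the real packet size.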

We can create the issue with the following commands:

tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
overhead 999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2

This is caused by incorrectly assuming that qdisc_pkt_len() returns a
length within the range QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
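
Concretely, with the stab from the reproduction above (overhead
999999999) and an ICMP echo request of roughly 100 bytes, the result
is, in rough terms (the table lookup changes the exact value but not
its magnitude):

    len  = skb->len + overhead   /* ~100 + 999999999             */
    lmax = len                   /* far above QFQ_MAX_LMAX,
                                    i.e. 1 << 16 = 65536         */

so qfq_change_agg() would store an lmax far outside the range the rest
of QFQ assumes, which is what the bounds check added below rejects.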
---
 net/sched/sch_qfq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index c2a68f6e427e..81ebe7741463 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -116,6 +116,7 @@

 #define QFQ_MTU_SHIFT        16    /* to support TSO/GSO */
 #define QFQ_MIN_LMAX        512    /* see qfq_slot_insert */
+#define QFQ_MAX_LMAX        (1UL << QFQ_MTU_SHIFT)

 #define QFQ_MAX_AGG_CLASSES    8 /* max num classes per aggregate allowed */

@@ -387,8 +388,13 @@ static int qfq_change_agg(struct Qdisc *sch, struct qfq_class *cl, u32 weight,
                u32 lmax)
 {
     struct qfq_sched *q = qdisc_priv(sch);
-    struct qfq_aggregate *new_agg = qfq_find_agg(q, lmax, weight);
+    struct qfq_aggregate *new_agg;

+    /* 'lmax' can range from [QFQ_MIN_LMAX, pktlen + stab overhead] */
+    if (lmax > QFQ_MAX_LMAX)
+        return -EINVAL;
+
+    new_agg = qfq_find_agg(q, lmax, weight);
     if (new_agg == NULL) { /* create new aggregate */
         new_agg = kzalloc(sizeof(*new_agg), GFP_ATOMIC);
         if (new_agg == NULL)
--
2.34.1