Re: Re: [PATCH,net-next] tcp: Add TCP ROCCET congestion control module.

From: Neal Cardwell

Date: Tue Apr 14 2026 - 11:27:04 EST


On Tue, Apr 14, 2026 at 7:23 AM Lukas Prause
<lukas.prause@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Thanks for the very detailed review of our code.
> We will incorporate your comments regarding documentation and variable
> usage into a new version of our code.

Sounds good. Thank you.

> > Please reference figures in the paper and mention specific concrete
> > numerical examples of latency reductions to quantify these statements.
>
> Figures 5 and 6 show the performance of ROCCET in stationary and mobile
> scenarios (https://arxiv.org/pdf/2510.25281). In the analyzed scenario,
> we have observed a lower sRTT with ROCCET than with BBRv3 and CUBIC. The
> observed throughput was marginally lower than that of BBRv3, but still
> on a similar level. A detailed quantitative evaluation can be found in
> the paper in sections VI and VII.

In https://arxiv.org/pdf/2510.25281, zooming into the Figure 6 sRTT
box-and-whisker plot seems to show that BBRv3 actually has a lower
median sRTT than ROCCET. So that statement seems misleading?

I would recommend using numerical examples in the commit message to
quantify the gains from ROCCET and avoid potential issues from visual
interpretation of graphs.

> > Can you please elaborate on this statement here? AFAICT from figures 7
> > and 8 in https://arxiv.org/pdf/2510.25281 it seems ROCCET is
> > essentially starved by CUBIC when sharing a bottleneck with CUBIC when
> > the bottleneck has 2*BDP or more of buffering. AFAICT it sounds like
> > ROCCET does have "fairness issues when sharing a link with TCP CUBIC"?
>
> Our main use case is a connection where the bottleneck link is in the
> cellular network, where the bottleneck queue is typically not shared
> between flows. "Fairness" between flows is implemented by the base
> station's scheduler. In this scenario, ROCCET achieves its objective to
> not "bloat" its own queue.
>
> We have performed additional fairness experiments in non-cellular
> networks (figures 7 and 8). Here we show that even when used in other
> types of networks, ROCCET does not cause harm (see
> https://dl.acm.org/doi/10.1145/3365609.3365855) to other congestion
> control algorithms.

I do not see you objecting to my statement, "it seems ROCCET is
essentially starved by CUBIC when sharing a bottleneck with CUBIC when
the bottleneck has 2*BDP or more of buffering." So I guess you agree.

IMHO it's important to keep in mind that a congestion control that
starves in the presence of CUBIC may have limited deployment. This is
a key reason why Vegas was never deployed at scale.

> > Please specify what side effect or side effects ROCCET is claiming to
> > solve (presumably bufferbloat?).
> The side effect we observe in cellular networks is that, in particular,
> for loss-based congestion control, the cwnd often gets 'frozen' at a
> size that is too large for the BDP of the current link. This effect is
> caused by the TCP cwnd validation, which at some point stops increasing
> the cwnd because it assumes that the sender is application-limited.
> However, this often leads to a cwnd size that is too large for the link,
> but too small to cause a congestion event by overfilling the buffer. The
> result is a standing queue that causes permanently high RTTs. Figure 2
> in the paper (https://arxiv.org/pdf/2510.25281) shows the described
> behaviour for a single TCP CUBIC flow.

OK, so that sounds like you are describing the standard bufferbloat
problem. So you could replace the phrase "solves an unwanted side
effects of CUBIC’s implementation" in your comment with something
like: "avoids the bufferbloat problems inherent in CUBIC."

> > Expressed in isolation like this, that sounds potentially dangerous.
> > Please mention what signal(s) ROCCET uses to exit slow start if it's
> > not using loss.
> >
> > In addition, from reading the code AFAICT the connection does use loss
> > to exit slow start (see my remarks below in this message). So AFAICT
> > this summary seems inaccurate, or at least misleading?
> You are right, the summary is misleading. In the code we submitted,
> there are three conditions for exiting slow start:
> The first is packet loss (as you already mentioned, without a cwnd
> reduction). The second is when the sRTT calculated by ROCCET exceeds an
> upper bound and the ACK rate, sampled in 100 ms intervals, differs by 10
> segments. The third is when the growth of the cwnd is stopped by TCP
> cwnd validation (which considers the connection application-limited).

OK, thanks for clarifying.

> > If no lower RTT is found for 10 seconds, the algorithm interpolates
> > the `min_rtt` upwards towards the current RTT.
> >
> > + If the path is persistently congested (e.g., a large buffer is
> > constantly full), the `min_rtt` baseline will drift up.
> >
> > + This makes the algorithm less sensitive to queueing delay over
> > time, potentially defeating the purpose of reducing bufferbloat in the
> > long run. Contrast this with BBR, which actively drains the queue
> > (using the ProbeRTT mechanism) to try to find the true physical
> > minimum RTT.
> >
> > Can you please add a comment explaining why the ROCCET algorithm takes
> > this approach, and how the algorithm expects to avoid queues that
> > ratchet ever higher?
> We added this functionality for the edge case of long-lived fat flows,
> which are experiencing routing changes, to detect a higher base RTT.
> Since this functionality is disabled by default and can also cause
> problems with min_RTT detection, we have decided to remove it.
> The measurement results in our paper have been obtained with this
> functionality disabled.

Again, thanks for clarifying.

> > Here, `cnt` is incremented by `1` on every call, regardless of the
> > `acked` value (number of packets ACKed in this event).
> You are right, we will change this.

Great. Thanks.

> > + With the default `ack_rate_diff_ca` of `200`, this condition will
> > become true for $sum_cwnd * 100 / sum_acked >= 200$, i.e.
> > $num_acks_per_round * 100 >= 200$. So AFAICT we expect this condition
> > to be true if there are 2 or more ACKs in a round trip. This makes
> > `bw_limit_detect` effectively a no-op or always-on trigger rather than
> > a true detector of queue growth or bandwidth limits.
> The purpose of this part of the code was to detect an increasing queue
> by monitoring data sent and acknowledged in combination with an
> increasing sRTT over 5 RTT time intervals. In the steady state of a TCP
> connection, the sending rate of the TCP sender should be equal to the
> receiver's ack rate, due to TCP self-clocking. The idea behind this code
> was to check if the cwnd is still correlated to the sending rate. If
> this is not the case and we also observe increasing RTTs, we assume the
> TCP sender is filling a buffer. However, we have made a mistake when
> calculating sum_cwnd:
> We are accumulating the cwnd on each ack event, instead of each RTT,
> which, as you mentioned, would make more sense. Because this leads to
> the erroneous behaviour that you described, we will remove this part of
> the code for now until we have evaluated the intended implementation.

Sounds good. Thanks.

> > Did the experiments in the paper use the approach documented in the
> > paper, or the approach documented in this code? They are very
> > different, AFAICT.
> The experiments were performed using the submitted code. This means that
> the mentioned code snippet always evaluates to true, so that ROCCET only
> reacts to changes in latency, which is different from what we described
> in the paper.

Got it. Thanks.

> > Having a module parameter to ignore loss in this way makes it too easy
> > for users to cause excessive congestion. I would urge you to remove
> > that module parameter. Researchers can add that sort of mechanism in
> > their own code for research.
> That is true, we will remove this part of the implementation.

Sounds good.

Thanks!

neal