Re: [PATCH] tcp: Do not reset the icsk_ca_initialized in tcp_init_transfer.

From: Neal Cardwell
Date: Thu Jul 01 2021 - 10:23:37 EST


On Wed, Jun 30, 2021 at 2:25 PM Phi Nguyen <phind.uet@xxxxxxxxx> wrote:
>
> On 6/29/2021 11:59 PM, Neal Cardwell wrote:
> > On Tue, Jun 29, 2021 at 8:58 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > From my perspective, the bug was introduced when that 8919a9b31eb4
> > commit introduced icsk_ca_initialized and set icsk_ca_initialized to 0
> > in tcp_init_transfer(), missing the possibility that a process could
> > call setsockopt(TCP_CONGESTION) in state TCP_SYN_SENT (i.e. after the
> > connect() or TFO open sendmsg()), which would call
> > tcp_init_congestion_control(). The 8919a9b31eb4 commit did not intend
> > to reset any initialization that the user had already explicitly made;
> > it just missed the possibility of that particular sequence (which
> > syzkaller managed to find!).
> >
> >> Although I am not sure what happens at accept() time when the listener
> >> socket is cloned.
> >
> > It seems that for listener sockets, they cannot initialize their CC
> > module state, because there is no way for them to reach
> > tcp_init_congestion_control(), since:
> >
> > (a) tcp_set_congestion_control() -> tcp_reinit_congestion_control()
> > will not call tcp_init_congestion_control() on a socket in CLOSE or
> > LISTEN
> >
> > (b) tcp_init_transfer() -> tcp_init_congestion_control() can only
> > happen for established sockets and successful TFO SYN_RECV sockets
> Is this what was mentioned in this commit ce69e563b325(tcp: make sure
> listeners don't initialize congestion-control state)

Yes, exactly.

> > --
> > [PATCH] tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized
> >
> > This commit fixes a bug (found by syzkaller) that could cause spurious
> > double-initializations for congestion control modules, which could cause memory
> > leaks orother problems for congestion control modules (like CDG) that allocate
> > memory in their init functions.
> >
> > The buggy scenario constructed by syzkaller was something like:
> >
> > (1) create a TCP socket
> > (2) initiate a TFO connect via sendto()
> > (3) while socket is in TCP_SYN_SENT, call setsockopt(TCP_CONGESTION),
> > which calls:
> > tcp_set_congestion_control() ->
> > tcp_reinit_congestion_control() ->
> > tcp_init_congestion_control()
> > (4) receive ACK, connection is established, call tcp_init_transfer(),
> > set icsk_ca_initialized=0 (without first calling cc->release()),
> > call tcp_init_congestion_control() again.
> >
> > Note that in this sequence tcp_init_congestion_control() is called twice
> > without a cc->release() call in between. Thus, for CC modules that allocate
> > memory in their init() function, e.g, CDG, a memory leak may occur. The
> > syzkaller tool managed to find a reproducer that triggered such a leak in CDG.
> >
> > The bug was introduced when that 8919a9b31eb4 commit introduced
> > icsk_ca_initialized and set icsk_ca_initialized to 0 in tcp_init_transfer(),
> > missing the possibility for a sequence like the one above, where a process
> > could call setsockopt(TCP_CONGESTION) in state TCP_SYN_SENT (i.e. after the
> > connect() or TFO open sendmsg()), which would call
> > tcp_init_congestion_control(). The 8919a9b31eb4 commit did not intend to reset
> > any initialization that the user had already explicitly made; it just missed
> > the possibility of that particular sequence (which syzkaller managed to find).
>
> Could I use your commit message when I resubmit patch?

Yes, feel free to use that commit message verbatim or modified.

thanks,
neal