Re: [PATCH] tcp: Do not reset the icsk_ca_initialized in tcp_init_transfer.
From: Nguyen Dinh Phi
Date: Tue Jun 29 2021 - 08:28:30 EST
On June 29, 2021 4:21:59 PM GMT+08:00, Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>On Tue, Jun 29, 2021 at 9:17 AM Nguyen Dinh Phi <phind.uet@xxxxxxxxx>
>wrote:
>>
>> On June 29, 2021 1:20:19 AM GMT+08:00, Neal Cardwell
><ncardwell@xxxxxxxxxx> wrote:
>> >)
>> >
>> >On Mon, Jun 28, 2021 at 1:15 PM Phi Nguyen <phind.uet@xxxxxxxxx>
>wrote:
>> >>
>> >> On 6/29/2021 12:24 AM, Neal Cardwell wrote:
>> >>
>> >> > Thanks.
>> >> >
>> >> > Can you also please provide a summary of the event sequence that
>> >> > triggers the bug? Based on your Reported-by tag, I guess this is
>> >based
>> >> > on the syzbot reproducer:
>> >> >
>> >> >
>>
>>https://groups.google.com/g/syzkaller-bugs/c/VbHoSsBz0hk/m/cOxOoTgPCAAJ
>> >> >
>> >> > but perhaps you can give a summary of the event sequence that
>> >causes
>> >> > the bug? Is it that the call:
>> >> >
>> >> > setsockopt$inet_tcp_TCP_CONGESTION(r0, 0x6, 0xd,
>> >> > &(0x7f0000000000)='cdg\x00', 0x4)
>> >> >
>> >> > initializes the CC and happens before the connection is
>> >established,
>> >> > and then when the connection is established, the line that sets:
>> >> > icsk->icsk_ca_initialized = 0;
>> >> > is incorrect, causing the CC to be initialized again without
>first
>> >> > calling the cleanup code that deallocates the CDG-allocated
>memory?
>> >> >
>> >> > thanks,
>> >> > neal
>> >> >
>> >>
>> >> Hi Neal,
>> >>
>> >> The gdb stack trace that lead to init_transfer_input() is as
>bellow,
>> >the
>> >> current sock state is TCP_SYN_RECV.
>> >
>> >Thanks. That makes sense as a snapshot of time for
>> >tcp_init_transfer(), but I think what would be more useful would be
>a
>> >description of the sequence of events, including when the CC was
>> >initialized previous to that point (as noted above, was it that the
>> >setsockopt(TCP_CONGESTION) completed before that point?).
>> >
>> >thanks,
>> >neal
>>
>> I resend my message because I accidently used html format in last
>one. I am very sorry for the inconvenience caused.
>> ---
>> Yes, the CC had been initialized by the setsockopt, after that, it
>was initialized again in function tcp_init_transfer() because of
>setting isck_ca_initialized to 0.
>
>"the setsockopt" is rather vague, sorry.
>
>
>The hard part is that all scenarios have to be considered.
>
>TCP flows can either be passive and active.
>
>CC can be set :
>
>1) Before the connect() or accept()
>2) After the connect() or accept()
>3) after the connect() but before 3WHS is completed.
>
>So we need to make sure all cases will still work with any combination
>of CDG CC (before/after) in the picture.
>
>Note that a memory leak for a restricted CC (CDG can only be used by
>CAP_NET_ADMIN privileged user)
> is a small problem compared to more serious bug that could be added
>by an incomplete fix.
>
>I also note that if icsk_ca_priv] was increased from 104 to 120 bytes,
>tcp_cdg would no longer need a dynamic memory allocation.
>
>Thank you.
Hi,
I will try to see whether I am able to get the full sequence. I am also affraid of making a change that could affect big part of the kernel.
About CDG, how we can get rid of dynamic allocation by increasing icsk_priv_data to 120? because I see that the window size is a module parameter, so I guess it is not a fixed value.
Because the problem only happens with CDG, is adding check in its tcp_cdg_init() function Ok? And about icsk_ca_initialized, Could I expect it to be 0 in CC's init functions?
Thank you.