Re: [PATCH net] can: isotp: isotp_rcv_cf(): fix so->rx race problem

From: Ziyang Xuan (William)
Date: Thu Jan 20 2022 - 20:50:44 EST


>
> On 20.01.22 12:28, Ziyang Xuan (William) wrote:
>>>
>>> On 20.01.22 07:24, Ziyang Xuan (William) wrote:
>>>
>>>> I have reproduced the syz problem with Marc's commit applied; the commit does not fix the panic.
>>>> So I tried to find the root cause of the panic and came up with my own solution.
>>>>
>>>> Marc's commit only fixes the case where the packet size is bigger than INT_MAX, which triggers
>>>> a tpcon::{idx,len} integer overflow, but the packet size is 4096 in the syz problem.
>>>>
>>>> so->rx.len is 0 after the following logic in isotp_rcv_ff():
>>>>
>>>> /* get the FF_DL */
>>>> so->rx.len = (cf->data[ae] & 0x0F) << 8;
>>>> so->rx.len += cf->data[ae + 1];
>>>>
>>>> so->rx.len is 4096 after the following logic in isotp_rcv_ff():
>>>>
>>>> /* FF_DL = 0 => get real length from next 4 bytes */
>>>> so->rx.len = cf->data[ae + 2] << 24;
>>>> so->rx.len += cf->data[ae + 3] << 16;
>>>> so->rx.len += cf->data[ae + 4] << 8;
>>>> so->rx.len += cf->data[ae + 5];
>>>>
>>>
>>> In these cases the value 0 could be the minimum value of so->rx.len - but the value 0 can not show up in isotp_rcv_cf(), as this function requires so->rx.state to be ISOTP_WAIT_DATA.
>>
>> Consider the scenario where isotp_rcv_cf() and isotp_rcv_ff() run concurrently for the same isotp_sock in the following sequence:
>
> o_O
>
> Sorry but the receive path is not designed to handle concurrent receptions that would run isotp_rcv_cf() and isotp_rcv_ff() simultaneously.
>
>> isotp_rcv_cf()
>> if (so->rx.state != ISOTP_WAIT_DATA) [false]
>>                         isotp_rcv_ff()
>>                         so->rx.state = ISOTP_IDLE
>>                         /* get the FF_DL */ [so->rx.len == 0]
>> alloc_skb() [so->rx.len == 0]
>>                         /* FF_DL = 0 => get real length from next 4 bytes */ [so->rx.len == 4096]
>> skb_put(nskb, so->rx.len) [so->rx.len == 4096]
>> skb_over_panic()
>>
>
> Even though this case is not possible with a real CAN bus due to the CAN frame transmission times we could introduce some locking (or dropping of concurrent CAN frames) in isotp_rcv() - but this code runs in net softirq context ...
>

I think the kernel code should keep the kernel available no matter what user space
does. The tx path already handles the so->tx race condition, but the rx path does not do the same for so->rx.
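
To make the interleaving above concrete, here is a small user-space sketch (not kernel code) that replays the same sequence in a single thread. It only assumes the two-step FF_DL decode quoted earlier; struct rx_state, ff_decode_short(), ff_decode_long(), put_fits(), cf_racy() and cf_snapshot() are hypothetical names for illustration, and put_fits() is a simplified stand-in for the bounds check in skb_put() (a real skb has some slack, end:0xc0 in my log, but far less than 4096).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the len field of struct tpcon. */
struct rx_state {
	unsigned int len;
};

/* Step 1 of the FF_DL decode: 12 bit length from the first two PCI bytes. */
static void ff_decode_short(struct rx_state *rx, const uint8_t *data)
{
	assert(rx != NULL && data != NULL);
	rx->len = (data[0] & 0x0F) << 8;
	rx->len += data[1];
}

/* Step 2: FF_DL == 0, so the real length comes from the next 4 bytes. */
static void ff_decode_long(struct rx_state *rx, const uint8_t *data)
{
	assert(rx != NULL && data != NULL);
	rx->len = (unsigned int)data[2] << 24;
	rx->len += (unsigned int)data[3] << 16;
	rx->len += (unsigned int)data[4] << 8;
	rx->len += data[5];
}

/* Simplified skb_put() check: 0 if put_len fits, -1 on overrun. */
static int put_fits(unsigned int capacity, unsigned int put_len)
{
	return put_len <= capacity ? 0 : -1;
}

/* Reads rx->len twice, with the second FF decode step in between. */
static int cf_racy(struct rx_state *rx, const uint8_t *ff)
{
	ff_decode_short(rx, ff);          /* other context: len == 0    */
	unsigned int capacity = rx->len;  /* alloc_skb(so->rx.len)      */
	ff_decode_long(rx, ff);           /* other context: real length */
	return put_fits(capacity, rx->len); /* skb_put(nskb, so->rx.len) */
}

/* Reads rx->len once and uses the same value for alloc and put. */
static int cf_snapshot(struct rx_state *rx, const uint8_t *ff)
{
	ff_decode_short(rx, ff);
	unsigned int len = rx->len;       /* single read */
	unsigned int capacity = len;
	ff_decode_long(rx, ff);
	return put_fits(capacity, len);
}
```

With first frame bytes 10 00 00 00 10 00 (ae = 0, length 4096), cf_racy() reports an overrun while cf_snapshot() does not. The snapshot only illustrates the double read of so->rx.len; it would not protect so->rx.buf or so->rx.state, so locking (or dropping concurrent frames) in isotp_rcv() as you suggested is probably still needed.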

> Regards,
> Oliver
>
>
>>>
>>> And when so->rx.len is 0 in isotp_rcv_ff() this check
>>>
>>> if (so->rx.len + ae + off + ff_pci_sz < so->rx.ll_dl)
>>>          return 1;
>>>
>>> will return from isotp_rcv_ff() before ISOTP_WAIT_DATA is set at the end. So after that above check we are still in ISOTP_IDLE state.
>>>
>>> Or did I miss something here?
>>>
>>>> so->rx.len is 0 before alloc_skb() and 4096 after alloc_skb() in isotp_rcv_cf(). The following
>>>> skb_put() then triggers the panic.
>>>>
>>>> The following is my reproduction log with Marc's commit and my debug modification in isotp_rcv_cf().
>>>>
>>>> [  150.605776][    C6] isotp_rcv_cf: before alloc_skb so->rc.len: 0, after alloc_skb so->rx.len: 4096
>>>
>>>
>>> But so->rx.len is not a value that is modified by alloc_skb():
>>>
>>>                  nskb = alloc_skb(so->rx.len, gfp_any());
>>>                  if (!nskb)
>>>                          return 1;
>>>
>>>                  memcpy(skb_put(nskb, so->rx.len), so->rx.buf,
>>>                         so->rx.len);
>>>
>>>
>>> Can you send your debug modification changes please?
>>
>> My reproduction debug patch is attached and also inlined below:
>>
>> diff --git a/net/can/isotp.c b/net/can/isotp.c
>> index df6968b28bf4..8b12d63b4d59 100644
>> --- a/net/can/isotp.c
>> +++ b/net/can/isotp.c
>> @@ -119,8 +119,8 @@ enum {
>>   };
>>
>>   struct tpcon {
>> -       int idx;
>> -       int len;
>> +       unsigned int idx;
>> +       unsigned int len;
>>          u32 state;
>>          u8 bs;
>>          u8 sn;
>> @@ -505,6 +505,7 @@ static int isotp_rcv_cf(struct sock *sk, struct canfd_frame *cf, int ae,
>>          struct isotp_sock *so = isotp_sk(sk);
>>          struct sk_buff *nskb;
>>          int i;
>> +       bool unexpection = false;
>>
>>          if (so->rx.state != ISOTP_WAIT_DATA)
>>                  return 0;
>> @@ -562,11 +563,13 @@ static int isotp_rcv_cf(struct sock *sk, struct canfd_frame *cf, int ae,
>>                                  sk_error_report(sk);
>>                          return 1;
>>                  }
>> -
>> +               if (so->rx.len == 0)
>> +                       unexpection = true;
>>                  nskb = alloc_skb(so->rx.len, gfp_any());
>>                  if (!nskb)
>>                          return 1;
>> -
>> +               if (unexpection)
>> +                       printk("%s: before alloc_skb so->rc.len: 0, after alloc_skb so->rx.len: %u\n", __func__, so->rx.len);
>>                  memcpy(skb_put(nskb, so->rx.len), so->rx.buf,
>>                         so->rx.len);
>>
>>
>>>
>>> Best regards,
>>> Oliver
>>>
>>>> [  150.611477][    C6] skbuff: skb_over_panic: text:ffffffff881ff7be len:4096 put:4096 head:ffff88807f93a800 data:ffff88807f93a800 tail:0x1000 end:0xc0 dev:<NULL>
>>>> [  150.615837][    C6] ------------[ cut here ]------------
>>>> [  150.617238][    C6] kernel BUG at net/core/skbuff.c:113!
>>>>
>>>