Re: [PATCH] can: j1939: fix memory leak of skbs

From: Oleksij Rempel
Date: Fri Jul 29 2022 - 00:23:09 EST


Hi Fedor,

thank you for your work.

On Fri, Jul 08, 2022 at 08:59:49PM +0300, Fedor Pchelkin wrote:
> Syzkaller reported a memory leak of skbs introduced by commit
> 2030043e616c ("can: j1939: fix Use-after-Free, hold skb ref while in use").
>
> Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743
>
> The suggested solution was tested on the new memory-leak Syzkaller repro
> and on the old use-after-free repro (that use-after-free bug was solved
> with the aforementioned commit). However, there may be other
> situations where the numbers of skb_get() and skb_unref() calls don't
> match that I have not spotted.
>
> Moreover, skb_unref() call can be harmlessly removed from line 338 in
> j1939_session_skb_drop_old() (/net/can/j1939/transport.c). But then I
> assume this removal ruins the whole reference counts logic...
>
> Overall, there is definitely something not clear in skb reference counts
> management with skb_get() and skb_unref(). The solution we suggested fixes
> the leaks and use-after-free's induced by Syzkaller but perhaps the origin
> of the problem can be somewhere else.
>
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> Signed-off-by: Fedor Pchelkin <pchelkin@xxxxxxxxx>
> Signed-off-by: Alexey Khoroshilov <khoroshilov@xxxxxxxxx>
> ---
> net/can/j1939/transport.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index 307ee1174a6e..9600b339cbf8 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -356,7 +356,6 @@ void j1939_session_skb_queue(struct j1939_session *session,
>
> skcb->flags |= J1939_ECU_LOCAL_SRC;
>
> - skb_get(skb);
> skb_queue_tail(&session->skb_queue, skb);
> }

This skb_get() is the counterpart of the skb_unref() in
j1939_session_skb_drop_old().

The initial issue can be reproduced on a real (slow) CAN bus with the
j1939cat[1] tool. Both sides (sender and receiver) should be started to make
sure that j1939_session_tx_dat() actually starts using the queue. After
pushing about 100K of data, the application will try to close the socket and
exit. Once the socket is closed, all skbs related to this socket are freed,
and without the extra reference j1939_session_tx_dat() will use freed skbs.
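
To illustrate why the two calls belong together, here is a minimal user-space
sketch. It is not kernel code: the toy_* names are made up for this example
and model only a reference count, under the assumption that the queue holds
its own reference from the time an skb is queued until it is dropped as old.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff; only the reference count is modelled. */
struct toy_skb {
	int users;	/* models the skb reference count */
	int freed;	/* set when the last reference is dropped */
};

/* Models skb_get(): take one additional reference. */
static void toy_skb_get(struct toy_skb *skb)
{
	assert(!skb->freed);
	skb->users++;
}

/* Models releasing one reference; the skb is "freed" on the last put. */
static void toy_skb_put(struct toy_skb *skb)
{
	assert(!skb->freed && skb->users > 0);
	if (--skb->users == 0)
		skb->freed = 1;
}

#define TOY_QUEUE_MAX 8

/* Toy stand-in for the per-session skb queue. */
struct toy_session {
	struct toy_skb *queue[TOY_QUEUE_MAX];
	size_t len;
};

/*
 * Models queueing an skb on the session. take_ref = 1 corresponds to the
 * current code (queue takes its own reference), take_ref = 0 to the
 * proposed patch (no reference taken).
 */
static void toy_session_queue(struct toy_session *s, struct toy_skb *skb,
			      int take_ref)
{
	assert(s->len < TOY_QUEUE_MAX);
	if (take_ref)
		toy_skb_get(skb);
	s->queue[s->len++] = skb;
}

/* Models dropping the oldest skb: unlink it and release the queue's ref. */
static void toy_session_drop_old(struct toy_session *s)
{
	struct toy_skb *skb;
	size_t i;

	assert(s->len > 0);
	skb = s->queue[0];
	for (i = 1; i < s->len; i++)
		s->queue[i - 1] = s->queue[i];
	s->len--;
	toy_skb_put(skb);
}
```

In this model, if the owner (the socket side) releases its reference while
the skb is still queued, the skb survives only when the queue took its own
reference. With take_ref = 0 the queue is left with a pointer to a freed
skb, which corresponds to the use-after-free described above.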

NACK for this patch.

1. https://github.com/linux-can/can-utils/blob/master/j1939cat.c
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |