Re: [PATCH] net: fix memory leaks in flush_backlog() with RPS

From: Eric Dumazet
Date: Fri May 01 2020 - 23:32:29 EST

On 5/1/20 8:15 PM, Qian Cai wrote:
> netif_receive_skb_list_internal() could call enqueue_to_backlog() to put
> some skb to softnet_data.input_pkt_queue and then in
> ip_route_input_slow(), it allocates a dst_entry to be used in
> skb_dst_set(). Later,
>
> cleanup_net
>   default_device_exit_batch
>     unregister_netdevice_many
>       rollback_registered_many
>         flush_all_backlogs
>
> will call flush_backlog() for all CPUs which would call kfree_skb() for
> each skb on the input_pkt_queue without calling skb_dst_drop() first.
>
> unreferenced object 0xffff97008e4c4040 (size 176):
> comm "softirq", pid 0, jiffies 4295173845 (age 32012.550s)
> hex dump (first 32 bytes):
> 00 d0 a5 74 04 97 ff ff 40 72 1a 96 ff ff ff ff ...t....@xxxxxxx
> c1 a3 c5 95 ff ff ff ff 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<0000000030483fae>] kmem_cache_alloc+0x184/0x430
> [<000000007ae17545>] dst_alloc+0x8e/0x128
> [<000000001efe9a1f>] rt_dst_alloc+0x6f/0x1e0
> rt_dst_alloc at net/ipv4/route.c:1628
> [<00000000e67d4dac>] ip_route_input_rcu+0xdfe/0x1640
> ip_route_input_slow at net/ipv4/route.c:2218
> (inlined by) ip_route_input_rcu at net/ipv4/route.c:2348
> [<000000009f30cbc0>] ip_route_input_noref+0xab/0x1a0
> [<000000004f53bd04>] arp_process+0x83a/0xf50
> arp_process at net/ipv4/arp.c:813 (discriminator 1)
> [<0000000061fd547d>] arp_rcv+0x276/0x330
> [<0000000007dbfa7a>] __netif_receive_skb_list_core+0x4d2/0x500
> [<0000000062d5f6d2>] netif_receive_skb_list_internal+0x4cb/0x7d0
> [<000000002baa2b74>] gro_normal_list+0x55/0xc0
> [<0000000093d04885>] napi_complete_done+0xea/0x350
> [<00000000467dd088>] tg3_poll_msix+0x174/0x310 [tg3]
> [<00000000498af7d9>] net_rx_action+0x278/0x890
> [<000000001e81d7e6>] __do_softirq+0xd9/0x589
> [<00000000087ee354>] irq_exit+0xa2/0xc0
> [<000000001c4db0cd>] do_IRQ+0x87/0x180
>
> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
> net/core/dev.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 522288177bbd..b898cd3036da 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5496,6 +5496,7 @@ static void flush_backlog(struct work_struct *work)
>  	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
>  		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
>  			__skb_unlink(skb, &sd->input_pkt_queue);
> +			skb_dst_drop(skb);
>  			kfree_skb(skb);
>  			input_queue_head_incr(sd);
>  		}
>


kfree_skb() is supposed to call skb_dst_drop() (look in skb_release_head_state()).

If you think about it, we would have hundreds of similar bugs if this were not the case.
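
To make the call chain concrete, here is a minimal user-space sketch, not the kernel source. The struct definitions are toy stand-ins holding only the fields needed here, and refs_left_after_free() is a hypothetical helper added for illustration. The real kfree_skb() also handles the skb's own reference count, and the real skb_dst_drop() handles non-refcounted dsts; both are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel types (assumption: only these fields matter). */
struct dst_entry {
	int refcnt;
};

struct sk_buff {
	struct dst_entry *dst;
};

/* Drop one reference on a dst; the model never frees the dst itself. */
static void dst_release(struct dst_entry *dst)
{
	if (dst) {
		assert(dst->refcnt > 0);
		dst->refcnt--;
	}
}

/* Release the dst attached to the skb, if any, and clear the pointer. */
static void skb_dst_drop(struct sk_buff *skb)
{
	if (skb->dst) {
		dst_release(skb->dst);
		skb->dst = NULL;
	}
}

/* Modeled after the kernel helper of the same name: it drops the dst. */
static void skb_release_head_state(struct sk_buff *skb)
{
	skb_dst_drop(skb);
}

/* Freeing an skb goes through skb_release_head_state() first. */
static void kfree_skb(struct sk_buff *skb)
{
	if (!skb)
		return;
	skb_release_head_state(skb);
	free(skb);
}

/*
 * Hypothetical helper: attach a dst holding one reference to a fresh skb,
 * optionally call skb_dst_drop() by hand, free the skb, and report how
 * many references remain on the dst. Returns -1 if allocation fails.
 */
static int refs_left_after_free(int explicit_drop)
{
	struct dst_entry dst = { .refcnt = 1 };
	struct sk_buff *skb = malloc(sizeof(*skb));

	if (!skb)
		return -1;
	skb->dst = &dst;
	if (explicit_drop)
		skb_dst_drop(skb);
	kfree_skb(skb);
	return dst.refcnt;
}
```

In this model, refs_left_after_free(0) and refs_left_after_free(1) both leave zero references on the dst. The explicit skb_dst_drop() before kfree_skb() only clears the pointer early, so the second drop inside kfree_skb() has nothing left to do.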