Re: [Linux Kernel Bug][ipv6/udp] memory leak in __ip6_append_data

From: Willem de Bruijn
Date: Mon Feb 05 2024 - 16:21:55 EST


Chenyuan Yang wrote:
> Hello Willem,
>
> Thanks for your reply!
>
> I double-checked the reproducer and ensured it could reproduce on the
> latest kernel (hash: 3eb5ca857d38ae7a694de6e59a3de7990af87919) with
> the config attached.
>
> ```
> root@syzkaller:~# gcc -pthread repro.c -o exe
> root@syzkaller:~# ./exe
> BUG: memory leak
> unreferenced object 0xffff88801a80c700 (size 240):
> comm "exe", pid 12074, jiffies 4295684229 (age 11.520s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 40 1a ad 1a 80 88 ff ff ........@.......
> backtrace:
> [<ffffffff81625419>] kmem_cache_alloc_node+0x2e9/0x440
> [<ffffffff83e747e7>] __alloc_skb+0x1f7/0x220
> [<ffffffff83e6a06b>] sock_omalloc+0x5b/0xa0
> [<ffffffff83e7d702>] msg_zerocopy_realloc+0xf2/0x340
> [<ffffffff8430d3a2>] __ip6_append_data.isra.0+0x1432/0x1e50
> [<ffffffff8430decf>] ip6_append_data+0x10f/0x2e0
> [<ffffffff84352bd1>] udpv6_sendmsg+0x851/0x1690
> [<ffffffff84305b39>] inet6_sendmsg+0x49/0x70
> [<ffffffff83e5e954>] __sock_sendmsg+0x54/0xb0
> [<ffffffff83e61982>] __sys_sendto+0x172/0x220
> [<ffffffff83e61a58>] __x64_sys_sendto+0x28/0x30
> [<ffffffff84ae676f>] do_syscall_64+0x3f/0x110
> [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b
>
> BUG: memory leak
> unreferenced object 0xffff88801f2e1400 (size 640):
> comm "exe", pid 12074, jiffies 4295684229 (age 11.520s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffffffff81625419>] kmem_cache_alloc_node+0x2e9/0x440
> [<ffffffff83e70b20>] kmalloc_reserve+0xe0/0x170
> [<ffffffff83e746c1>] __alloc_skb+0xd1/0x220
> [<ffffffff83e6a06b>] sock_omalloc+0x5b/0xa0
> [<ffffffff83e7d702>] msg_zerocopy_realloc+0xf2/0x340
> [<ffffffff8430d3a2>] __ip6_append_data.isra.0+0x1432/0x1e50
> [<ffffffff8430decf>] ip6_append_data+0x10f/0x2e0
> [<ffffffff84352bd1>] udpv6_sendmsg+0x851/0x1690
> [<ffffffff84305b39>] inet6_sendmsg+0x49/0x70
> [<ffffffff83e5e954>] __sock_sendmsg+0x54/0xb0
> [<ffffffff83e61982>] __sys_sendto+0x172/0x220
> [<ffffffff83e61a58>] __x64_sys_sendto+0x28/0x30
> [<ffffffff84ae676f>] do_syscall_64+0x3f/0x110
> [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b
> ```
>
> The C reproducer needs some time while the syz program can reproduce
> the issue more quickly.
>
> Let me know if you need further information to reproduce or debug.
>
> Best,
> Chenyuan
>
> On Sun, Jan 28, 2024 at 4:07 PM Willem de Bruijn
> <willemdebruijn.kernel@xxxxxxxxx> wrote:
> >
> > Chenyuan Yang wrote:
> > > Dear Linux Developers for Ipv6 Network,
> > >
> > > We encountered "memory leak in __ip6_append_data" when testing the
> > > ipv6 udp socket with Syzkaller and our generated specifications.
> > >
> > > The reproducers and config for the kernel are attached.
> > >
> > > ```
> > > BUG: memory leak
> > > unreferenced object 0xffff888018322900 (size 240):
> > > comm "syz-executor115", pid 8030, jiffies 4294985782 (age 11.650s)
> > > hex dump (first 32 bytes):
> > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...............
> > > 00 00 00 00 00 00 00 00 40 5a 8b 14 80 88 ff ff ........@Z.....
> > > backtrace:
> > > [<ffffffff81625419>] kmemleak_alloc_recursive
> > > include/linux/kmemleak.h:42 [inline]
> > > [<ffffffff81625419>] slab_post_alloc_hook mm/slab.h:766 [inline]
> > > [<ffffffff81625419>] slab_alloc_node mm/slub.c:3478 [inline]
> > > [<ffffffff81625419>] kmem_cache_alloc_node+0x2e9/0x440 mm/slub.c:3523
> > > [<ffffffff83e747e7>] __alloc_skb+0x1f7/0x220 net/core/skbuff.c:641
> > > [<ffffffff83e6a06b>] alloc_skb include/linux/skbuff.h:1286 [inline]
> > > [<ffffffff83e6a06b>] sock_omalloc+0x5b/0xa0 net/core/sock.c:2657
> > > [<ffffffff83e7d702>] msg_zerocopy_alloc net/core/skbuff.c:1552 [inline]
> > > [<ffffffff83e7d702>] msg_zerocopy_realloc+0xf2/0x340 net/core/skbuff.c:1628
> > > [<ffffffff8430d3a2>] __ip6_append_data.isra.0+0x1432/0x1e50
> > > net/ipv6/ip6_output.c:1517
> > > [<ffffffff8430decf>] ip6_append_data+0x10f/0x2e0 net/ipv6/ip6_output.c:1832
> > > [<ffffffff84352bd1>] udpv6_sendmsg+0x851/0x1690 net/ipv6/udp.c:1602
> > > [<ffffffff84305b39>] inet6_sendmsg+0x49/0x70 net/ipv6/af_inet6.c:657
> > > [<ffffffff83e5e954>] sock_sendmsg_nosec net/socket.c:730 [inline]
> > > [<ffffffff83e5e954>] __sock_sendmsg+0x54/0xb0 net/socket.c:745
> > > [<ffffffff83e61982>] __sys_sendto+0x172/0x220 net/socket.c:2194
> > > [<ffffffff83e61a58>] __do_sys_sendto net/socket.c:2206 [inline]
> > > [<ffffffff83e61a58>] __se_sys_sendto net/socket.c:2202 [inline]
> > > [<ffffffff83e61a58>] __x64_sys_sendto+0x28/0x30 net/socket.c:2202
> > > [<ffffffff84ae676f>] do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> > > [<ffffffff84ae676f>] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
> > > [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b
> > >
> > > BUG: memory leak
> > > unreferenced object 0xffff888014a58280 (size 640):
> > > comm "syz-executor115", pid 8030, jiffies 4294985782 (age 11.650s)
> > > hex dump (first 32 bytes):
> > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...............
> > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...............
> > > backtrace:
> > > [<ffffffff81625419>] kmemleak_alloc_recursive
> > > include/linux/kmemleak.h:42 [inline]
> > > [<ffffffff81625419>] slab_post_alloc_hook mm/slab.h:766 [inline]
> > > [<ffffffff81625419>] slab_alloc_node mm/slub.c:3478 [inline]
> > > [<ffffffff81625419>] kmem_cache_alloc_node+0x2e9/0x440 mm/slub.c:3523
> > > [<ffffffff83e70b20>] kmalloc_reserve+0xe0/0x170 net/core/skbuff.c:560
> > > [<ffffffff83e746c1>] __alloc_skb+0xd1/0x220 net/core/skbuff.c:651
> > > [<ffffffff83e6a06b>] alloc_skb include/linux/skbuff.h:1286 [inline]
> > > [<ffffffff83e6a06b>] sock_omalloc+0x5b/0xa0 net/core/sock.c:2657
> > > [<ffffffff83e7d702>] msg_zerocopy_alloc net/core/skbuff.c:1552 [inline]
> > > [<ffffffff83e7d702>] msg_zerocopy_realloc+0xf2/0x340 net/core/skbuff.c:1628
> > > [<ffffffff8430d3a2>] __ip6_append_data.isra.0+0x1432/0x1e50
> > > net/ipv6/ip6_output.c:1517
> > > [<ffffffff8430decf>] ip6_append_data+0x10f/0x2e0 net/ipv6/ip6_output.c:1832
> > > [<ffffffff84352bd1>] udpv6_sendmsg+0x851/0x1690 net/ipv6/udp.c:1602
> > > [<ffffffff84305b39>] inet6_sendmsg+0x49/0x70 net/ipv6/af_inet6.c:657
> > > [<ffffffff83e5e954>] sock_sendmsg_nosec net/socket.c:730 [inline]
> > > [<ffffffff83e5e954>] __sock_sendmsg+0x54/0xb0 net/socket.c:745
> > > [<ffffffff83e61982>] __sys_sendto+0x172/0x220 net/socket.c:2194
> > > [<ffffffff83e61a58>] __do_sys_sendto net/socket.c:2206 [inline]
> > > [<ffffffff83e61a58>] __se_sys_sendto net/socket.c:2202 [inline]
> > > [<ffffffff83e61a58>] __x64_sys_sendto+0x28/0x30 net/socket.c:2202
> > > [<ffffffff84ae676f>] do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> > > [<ffffffff84ae676f>] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
> > > [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b
> > >
> > > Syzkaller reproducer:
> > > # {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:
> > > SandboxArg:0 Leak:true NetInjection:false NetDevices:false
> > > NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false
> > > KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false
> > > Wifi:false IEEE802154:false Sysctl:false Swap:false UseTmpDir:false
> > > HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false
> > > Fault:false FaultCall:0 FaultNth:0}}
> > > r0 = socket$KGPT_inet6_udp(0xa, 0x2, 0x11)
> > > setsockopt$sock_int(r0, 0x1, 0x3c, &(0x7f0000000000)=0x1, 0x4)
> > > sendto$KGPT_inet6_dgram_ops(r0, 0x0, 0x0, 0x24008006,
> > > &(0x7f0000000180)={0xa, 0x4e20, 0x0, @loopback, 0x6}, 0x1c) (async)
> > > sendto$KGPT_inet6_dgram_ops(r0, &(0x7f00000015c0)="98", 0x1,
> > > 0x4000040, &(0x7f0000000040)={0xa, 0x4e24, 0x0, @empty, 0x1}, 0x1c)
> > > (rerun: 64)
> > > ```
> >
> > TL;DR: I haven't reproduced or found a bug through analysis yet.
> >
> > A race, as the program requires threaded mode.
> >
> > Short program:
> >
> > socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP) = 3
> >
> > setsockopt(3, SOL_SOCKET, SO_ZEROCOPY, [1], 4) = 0
> >
> > for (i = 0; i < UDP_MAX_SEGMENTS /* 64 */; i++)
> > sendto(3, "\230", 1, MSG_DONTWAIT|MSG_ZEROCOPY,
> > {sa_family=AF_INET6, sin6_port=htons(20004), sin6_flowinfo=htonl(0),
> > inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=1}, 28) = 1
> >
> > sendto(3, NULL, 0, MSG_PEEK|MSG_DONTROUTE|MSG_MORE|MSG_ZEROCOPY|MSG_FASTOPEN,
> > {sa_family=AF_INET6, sin6_port=htons(20000), sin6_flowinfo=htonl(0),
> > inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=6}, 28) = 0
> >
> > Each of the four blocks runs in a separate thread. In effect the two sendto
> > calls race.
> >
> > They differ in their sendto destination addresses. But on corked sockets
> > only the addr argument of the first call is used.
> >
> > Only the second call sets MSG_MORE, so only it can set up a udp packet that
> > persists between the calls. Kind of odd to even allocate an skb if length is 0B?
> >
> > The MSG_ZEROCOPY path is only taken for non-zero length, so can be ignored on
> > the 0B call. In __ip6_append_data:
> >
> > if ((flags & MSG_ZEROCOPY) && length) {
> >
> > So one uarg reference is taken on the second call. For corked udp sockets, the
> > total refcount on uarg also remains 1 regardless of the number of MSG_MORE
> > send calls, each of which calls msg_zerocopy_realloc.
> >
> > So an skb gets created and sent, using two calls (one MSG_MORE, one not). Both
> > calls return without error.
> >
> > Question is where the uarg can get lost or acquire an extra reference.

Still no root cause. But I ran some variations to narrow down the possibilities.

Besides the reported ubuf_info leak, kmemleak also reports an sk_alloc leak (and
associated apparmor_sk_alloc_security leak).

Most MSG_.. flags passed are irrelevant, as can be expected since many are
ignored on UDP tx.

Required are MSG_ZEROCOPY on both sendto calls, MSG_MORE on the 0B call and
MSG_DONTWAIT on the 1B call. Remove any of these and the issue no longer
reproduces.
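
To make the flag words concrete: a minimal sketch that checks the raw values
from the syzkaller program (0x24008006 and 0x4000040) against the required
subsets listed above. The helper names are made up for illustration, and the
fallback #define assumes the Linux value of MSG_ZEROCOPY for older libc
headers.

```c
#include <sys/socket.h>

/* Fallback for older libc headers; value matches the Linux definition. */
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Raw flag words taken from the syzkaller reproducer. */
#define SYZ_FLAGS_0B 0x24008006u	/* sendto with NULL buffer, length 0 */
#define SYZ_FLAGS_1B 0x04000040u	/* sendto with 1 byte of payload */

/* Hypothetical helper: flags found to be required on the 0B call. */
static unsigned int required_flags_0b(void)
{
	return MSG_MORE | MSG_ZEROCOPY;
}

/* Hypothetical helper: flags found to be required on the 1B call. */
static unsigned int required_flags_1b(void)
{
	return MSG_DONTWAIT | MSG_ZEROCOPY;
}

/* Returns 1 if every bit in 'required' is also set in 'flags'. */
static int has_all(unsigned int flags, unsigned int required)
{
	return (flags & required) == required;
}

/* Returns the bits in 'flags' that are not part of 'required'. */
static unsigned int extra_flags(unsigned int flags, unsigned int required)
{
	return flags & ~required;
}
```

With these, the 1B word is exactly the required set, while the 0B word
carries three extra bits (MSG_PEEK, MSG_DONTROUTE, MSG_FASTOPEN) that can be
dropped without affecting reproduction.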

The leak happens on a sendto 1B with MSG_ZEROCOPY | MSG_DONTWAIT. No prior skb
exists. So this is not an append to the sendto 0B with MSG_MORE.

The underlying issue is a race, and the window is very narrow. Adding even a
little instrumentation to store some state in ubuf_info (e.g., whether this is
a new skb or an append to a 0B payload skb) makes kmemleak stop reporting the
ubuf_info leak. Interestingly, it does still report the sk_alloc leak.
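
For anyone who wants to experiment, a minimal sketch of the reduced race in C.
This is not the attached repro.c and I have not verified that it triggers the
leak. It assumes only the required flags matter, runs socket() and
setsockopt() serially rather than in their own threads, and the names
race_fd, send_0b, send_1b and race_once are made up for illustration. Build
with gcc -pthread.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values match the Linux definitions. */
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

static int race_fd = -1;	/* socket shared by both sender threads */

static void fill_addr(struct sockaddr_in6 *a, const char *ip,
		      unsigned short port, unsigned int scope)
{
	memset(a, 0, sizeof(*a));
	a->sin6_family = AF_INET6;
	a->sin6_port = htons(port);
	a->sin6_scope_id = scope;
	inet_pton(AF_INET6, ip, &a->sin6_addr);
}

/* 0B send with MSG_MORE: corks the socket. Errors are ignored on purpose. */
static void *send_0b(void *arg)
{
	struct sockaddr_in6 a;

	(void)arg;
	fill_addr(&a, "::1", 20000, 6);
	sendto(race_fd, NULL, 0, MSG_MORE | MSG_ZEROCOPY,
	       (struct sockaddr *)&a, sizeof(a));
	return NULL;
}

/* 64 x 1B sends, racing with send_0b. Errors are ignored on purpose. */
static void *send_1b(void *arg)
{
	struct sockaddr_in6 a;
	int i;

	(void)arg;
	fill_addr(&a, "::", 20004, 1);
	for (i = 0; i < 64; i++)
		sendto(race_fd, "\230", 1, MSG_DONTWAIT | MSG_ZEROCOPY,
		       (struct sockaddr *)&a, sizeof(a));
	return NULL;
}

/* One attempt at the race. Returns 0 if both threads ran, -1 on setup error. */
static int race_once(void)
{
	pthread_t ta, tb;
	int one = 1;

	race_fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);
	if (race_fd < 0)
		return -1;
	if (setsockopt(race_fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) ||
	    pthread_create(&ta, NULL, send_0b, NULL)) {
		close(race_fd);
		return -1;
	}
	if (pthread_create(&tb, NULL, send_1b, NULL)) {
		pthread_join(ta, NULL);
		close(race_fd);
		return -1;
	}
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	close(race_fd);
	return 0;
}
```

Calling race_once() in a loop and then triggering a scan with
"echo scan > /sys/kernel/debug/kmemleak" should show whether the reduced
flag set is enough on a given kernel.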