Re: [syzbot] [net?] BUG: unable to handle kernel paging request in net_generic

From: James Chapman
Date: Mon Jul 29 2024 - 07:50:18 EST


On 26/07/2024 16:02, Jakub Kicinski wrote:
CC: James [L2TP]

On Thu, 25 Jul 2024 03:37:24 -0700 syzbot wrote:
Hello,

syzbot found the following issue on:

HEAD commit: c912bf709078 Merge remote-tracking branches 'origin/arm64-..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=1625a15e980000
kernel config: https://syzkaller.appspot.com/x/.config?x=79a49b0b9ffd6585
dashboard link: https://syzkaller.appspot.com/bug?extid=6acef9e0a4d1f46c83d4
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/fea69a9d153c/disk-c912bf70.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/be06762a72ef/vmlinux-c912bf70.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6c8e58b4215d/Image-c912bf70.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+6acef9e0a4d1f46c83d4@xxxxxxxxxxxxxxxxxxxxxxxxx

Unable to handle kernel paging request at virtual address dfff800000000257
KASAN: probably user-memory-access in range [0x00000000000012b8-0x00000000000012bf]
...
Call trace:
net_generic+0xd0/0x250 include/net/netns/generic.h:46
l2tp_pernet net/l2tp/l2tp_core.c:125 [inline]
l2tp_tunnel_get+0x90/0x464 net/l2tp/l2tp_core.c:207
l2tp_udp_recv_core net/l2tp/l2tp_core.c:852 [inline]
l2tp_udp_encap_recv+0x314/0xb3c net/l2tp/l2tp_core.c:933
udpv6_queue_rcv_one_skb+0x1870/0x1ad4 net/ipv6/udp.c:727
udpv6_queue_rcv_skb+0x3bc/0x574 net/ipv6/udp.c:789
udp6_unicast_rcv_skb+0x1cc/0x320 net/ipv6/udp.c:929
__udp6_lib_rcv+0xbcc/0x1330 net/ipv6/udp.c:1018
udpv6_rcv+0x88/0x9c net/ipv6/udp.c:1133
ip6_protocol_deliver_rcu+0x988/0x12a4 net/ipv6/ip6_input.c:438
ip6_input_finish+0x164/0x298 net/ipv6/ip6_input.c:483
...

The crash occurs when net_generic() dereferences an invalid net pointer while handling a received L2TPv2 packet.
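
As a rough sanity check, the fault address can be related to the bad pointer using the usual generic-KASAN shadow mapping. The sketch below is userspace C, not kernel code. It assumes 8 bytes of memory per shadow byte and takes the shadow offset from the value held in x21 in the register dump in the patch description; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: generic KASAN, 8 bytes of memory per shadow byte, and a
 * shadow offset equal to the value of x21 in the register dump. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdfff800000000000ULL

/* Memory address -> address of the shadow byte that describes it. */
static uint64_t shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Shadow address -> first byte of the 8-byte granule it describes. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* Self-check against the numbers in the report. */
static void check_report_numbers(void)
{
	assert(shadow_addr(0x12b8) == 0xdfff800000000257ULL);
	assert(shadow_to_mem(0xdfff800000000257ULL) == 0x12b8);
}
```

Under these assumptions, the reported fault address dfff800000000257 is the shadow byte for 0x12b8, the start of the range KASAN prints. That matches the lsr/ldrb pair in the disassembly and suggests a NULL (or near-NULL) net pointer plus a field offset, although which field lives at that offset is not something this sketch establishes.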

The stack trace indicates that l2tp_udp_recv_core() found that the packet's tunnel_id did not match the tunnel derived from sk_user_data of the receiving socket. This can happen when more than one socket shares the same 5-tuple address. When a tunnel ID mismatch is detected, l2tp looks up the tunnel using the ID from the packet. It is this lookup which faults in net_generic() when l2tp tries to access its per-net tunnel list.

The code implicated by the crash, which added support for aliased sockets, is no longer in linux-net or net-next. l2tp no longer looks up tunnels in the datapath; instead it looks up sessions without finding the parent tunnel first. The commits are:

* Support for aliased sockets was added in 628bc3e5a1be ("l2tp: Support several sockets with same IP/port quadruple") in May 2024.

* l2tp's receive path was refactored in ff6a2ac23cb0 ("l2tp: refactor udp recv to lookup to not use sk_user_data") in June 2024.

Is 628bc3e5a1be in any LTS or stable kernel? I didn't find it in linux-stable.git.

A possible fix is attached.
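
The ordering problem the patch addresses can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every type and function name here is hypothetical, there is no real concurrency, and the "packet arrival" is simulated by calling the receive stand-in at the point where the socket has just been armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net, struct l2tp_tunnel and struct sock. */
struct model_net    { int id; };
struct model_tunnel { struct model_net *net; };
struct model_sock   { struct model_tunnel *user_data; };

/* Stand-in for the receive path: it finds the tunnel through the socket
 * and needs the tunnel's net pointer. Returns true if that pointer is
 * usable, false if it would have been dereferenced while still NULL. */
static bool model_recv(const struct model_sock *sk)
{
	const struct model_tunnel *t = sk->user_data;

	return t != NULL && t->net != NULL;
}

/* Stand-in for tunnel registration. A packet is assumed to arrive as soon
 * as the socket is armed. Returns whether that packet saw a valid net. */
static bool model_register(struct model_tunnel *t, struct model_sock *sk,
			   struct model_net *net, bool set_net_first)
{
	bool recv_ok;

	if (set_net_first)
		t->net = net;		/* patched order: set before arming */

	sk->user_data = t;		/* socket can now deliver packets */
	recv_ok = model_recv(sk);	/* simulated early packet */

	if (!set_net_first)
		t->net = net;		/* original order: too late for that packet */

	return recv_ok;
}
```

In this model, model_register(..., false) returns false because the simulated packet sees a NULL net pointer, while model_register(..., true) returns true. The general point is that every field the receive path reads should be initialised before the socket is made able to receive.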

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
From 362b6725778bd3eb7a73e36a75e92e2b09ec7541 Mon Sep 17 00:00:00 2001
From: James Chapman <jchapman@xxxxxxxxxxx>
Date: Mon, 29 Jul 2024 09:49:07 +0100
Subject: [PATCH] l2tp: fix tunnel init / UDP socket receive race

syzbot exposes a race during tunnel init when the tunnel's UDP socket
is made ready to receive traffic before the tunnel's l2tp_net pointer
is set. This can result in a segfault in net_generic() when
l2tp_tunnel_get tries to access its per-net list.

Unable to handle kernel paging request at virtual address dfff800000000257
KASAN: probably user-memory-access in range [0x00000000000012b8-0x00000000000012bf]
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[dfff800000000257] address between user and kernel address ranges
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 6969 Comm: syz.2.105 Not tainted 6.10.0-rc7-syzkaller-gc912bf709078 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : net_generic+0xd0/0x250 include/net/netns/generic.h:46
lr : rcu_read_lock include/linux/rcupdate.h:782 [inline]
lr : net_generic+0x54/0x250 include/net/netns/generic.h:45
sp : ffff8000a6c86c10
x29: ffff8000a6c86c10 x28: dfff800000000000 x27: 0000000000000802
x26: 0000000000000002 x25: 1ffff00014d90d88 x24: dfff800000000000
x23: ffff0000ca3fbd70 x22: ffff8000a6c86c40 x21: dfff800000000000
x20: 00000000000012b8 x19: 000000000000004e x18: 1ffff00014d90cfe
x17: 000000000003099a x16: ffff80008054bde8 x15: 0000000000000001
x14: ffff80008f100568 x13: dfff800000000000 x12: 00000000af8628cd
x11: 0000000068a0e22d x10: 0000000000ff0100 x9 : 0000000000000000
x8 : 0000000000000257 x7 : ffff80008a4326a8 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000002
x2 : 0000000000000008 x1 : ffff80008b681f20 x0 : 0000000000000001
Call trace:
net_generic+0xd0/0x250 include/net/netns/generic.h:46
l2tp_pernet net/l2tp/l2tp_core.c:125 [inline]
l2tp_tunnel_get+0x90/0x464 net/l2tp/l2tp_core.c:207
l2tp_udp_recv_core net/l2tp/l2tp_core.c:852 [inline]
l2tp_udp_encap_recv+0x314/0xb3c net/l2tp/l2tp_core.c:933
udpv6_queue_rcv_one_skb+0x1870/0x1ad4 net/ipv6/udp.c:727
udpv6_queue_rcv_skb+0x3bc/0x574 net/ipv6/udp.c:789
udp6_unicast_rcv_skb+0x1cc/0x320 net/ipv6/udp.c:929
__udp6_lib_rcv+0xbcc/0x1330 net/ipv6/udp.c:1018
udpv6_rcv+0x88/0x9c net/ipv6/udp.c:1133
ip6_protocol_deliver_rcu+0x988/0x12a4 net/ipv6/ip6_input.c:438
ip6_input_finish+0x164/0x298 net/ipv6/ip6_input.c:483
NF_HOOK+0x328/0x3d4 include/linux/netfilter.h:314
ip6_input+0x90/0xa8 net/ipv6/ip6_input.c:492
dst_input include/net/dst.h:460 [inline]
ip6_rcv_finish+0x1f0/0x21c net/ipv6/ip6_input.c:79
NF_HOOK+0x328/0x3d4 include/linux/netfilter.h:314
ipv6_rcv+0x9c/0xbc net/ipv6/ip6_input.c:310
__netif_receive_skb_one_core net/core/dev.c:5625 [inline]
__netif_receive_skb+0x18c/0x3c8 net/core/dev.c:5739
netif_receive_skb_internal net/core/dev.c:5825 [inline]
netif_receive_skb+0x1f0/0x93c net/core/dev.c:5885
tun_rx_batched+0x568/0x6e4
tun_get_user+0x260c/0x3978 drivers/net/tun.c:2002
tun_chr_write_iter+0xfc/0x204 drivers/net/tun.c:2048
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0x8f8/0xc38 fs/read_write.c:590
ksys_write+0x15c/0x26c fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__arm64_sys_write+0x7c/0x90 fs/read_write.c:652
__invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:48
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:131
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:150
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Code: d2d00015 f2fbfff5 8b080294 d343fe88 (38756908)
---[ end trace 0000000000000000 ]---
----------------
Code disassembly (best guess):
0: d2d00015 mov x21, #0x800000000000 // #140737488355328
4: f2fbfff5 movk x21, #0xdfff, lsl #48
8: 8b080294 add x20, x20, x8
c: d343fe88 lsr x8, x20, #3
* 10: 38756908 ldrb w8, [x8, x21] <-- trapping instruction

Reported-by: syzbot+6acef9e0a4d1f46c83d4@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 628bc3e5a1be ("l2tp: Support several sockets with same IP/port quadruple")
Signed-off-by: James Chapman <jchapman@xxxxxxxxxxx>
---
net/l2tp/l2tp_core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 88a34db265d8..c5a6cfcf1b3f 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1523,6 +1523,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 		goto err;
 	}
 
+	tunnel->l2tp_net = net;
+
 	sk = sock->sk;
 	lock_sock(sk);
 	write_lock_bh(&sk->sk_callback_lock);
@@ -1551,7 +1553,6 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 
 	sock_hold(sk);
 	tunnel->sock = sk;
-	tunnel->l2tp_net = net;
 
 	spin_lock_bh(&pn->l2tp_tunnel_idr_lock);
 	idr_replace(&pn->l2tp_tunnel_idr, tunnel, tunnel->tunnel_id);
--
2.34.1