Re: BUG: unable to handle kernel NULL pointer dereference in sock_def_readable

From: Kuniyuki Iwashima
Date: Wed Aug 28 2024 - 21:18:36 EST


From: Xingyu Li <xli399@xxxxxxx>
Date: Wed, 28 Aug 2024 16:38:59 -0700
> Hi,
>
> We found a bug in Linux 6.10 using syzkaller. It is possibly a null
> pointer dereference bug.
> The bug report is as follows, but unfortunately there is no generated
> syzkaller reproducer.

Quoting Eric's words:

---8<---
I would ask you to stop sending these reports, we already have syzbot
with a more complete infrastructure.
---8<---
https://lore.kernel.org/netdev/CANn89iK6rq0XWO5-R5CzA5YAv2ygaTA==EVh+O74VHGDBNqUoA@xxxxxxxxxxxxxx/

(unless you have a repro that syzbot doesn't have, or you are confident
that this is a true positive)


>
> Bug report:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0 #13
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000006af8 EFLAGS: 00010046
> RAX: 1ffff92001572f0a RBX: 0000000000000000 RCX: 00000000000000c3
> RDX: 0000000000000010 RSI: 0000000000000001 RDI: ffffc9000ab97840
> RBP: 0000000000000001 R08: 0000000000000003 R09: fffff52000000d3c
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffc9000ab97850
> R13: 0000000000000000 R14: ffffc9000ab97840 R15: ffff88802dfb3680
> FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000000d932000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> __wake_up_common kernel/sched/wait.c:89 [inline]
> __wake_up_common_lock+0x134/0x1e0 kernel/sched/wait.c:106
> sock_def_readable+0x167/0x380 net/core/sock.c:3353

This seems to be caused by memory corruption:
skwq_has_sleeper() already has a NULL check.
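
To illustrate the reasoning, here is a minimal userspace sketch, not kernel
code. All names (wait_entry, wait_queue, model_*) are hypothetical stand-ins
for the real structures. It assumes the crash pattern is a wait queue entry
whose callback pointer is NULL: the check on the queue passes, yet the
wake-up loop still reaches a bad entry, which would match RIP 0x0 in the
trace if the entry had been overwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a wait queue entry. */
struct wait_entry {
	int (*func)(struct wait_entry *self);	/* wake callback */
	struct wait_entry *next;
};

/* Hypothetical, simplified model of a wait queue head. */
struct wait_queue {
	struct wait_entry *head;
};

/* Same shape as the check discussed above: a NULL queue pointer or an
 * empty queue means "no sleeper". It says nothing about the contents
 * of the entries themselves. */
bool model_has_sleeper(const struct wait_queue *wq)
{
	return wq != NULL && wq->head != NULL;
}

/* Walks the queue and calls each callback. Returns the number of
 * callbacks that reported a wake-up, or -1 on an entry whose callback
 * is NULL. The -1 path exists only in this model; a loop without that
 * test would jump to address 0 at this point. */
int model_wake_up(struct wait_queue *wq)
{
	int woken = 0;

	assert(wq != NULL);
	for (struct wait_entry *e = wq->head; e != NULL; e = e->next) {
		if (e->func == NULL)
			return -1;	/* corrupted entry detected */
		woken += e->func(e);
	}
	return woken;
}

/* Models the caller: check for sleepers first, then wake them. */
int model_readable(struct wait_queue *wq)
{
	if (!model_has_sleeper(wq))
		return 0;
	return model_wake_up(wq);
}

/* Trivial callback that reports one successful wake-up. */
int model_wake_one(struct wait_entry *self)
{
	(void)self;
	return 1;
}
```

Calling model_readable() with a NULL queue returns 0, with one valid entry
it returns 1, and with a zeroed entry linked into the queue it returns -1
even though model_has_sleeper() was true. That is why the NULL check alone
does not rule out this oops.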

I have recently seen several reports similar to yours, and they seem
unlikely to happen without such corruption occurring somewhere else.