Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault

From: Thomas Gleixner
Date: Fri Dec 08 2023 - 16:01:22 EST


On Fri, Dec 08 2023 at 15:11, Jann Horn wrote:
> On Tue, Nov 21, 2023 at 6:13 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> > BUG: unable to handle page fault for address: ffffffffff600000
>>
>> This is VSYSCALL_ADDR.
>>
>> So the real question is why the BPF program tries to copy from the
>> VSYSCALL page, which is not mapped.
>
> The linked syz repro is:
>
> r0 = bpf$PROG_LOAD(0x5, &(0x7f00000000c0)={0x11, 0xb,
> &(0x7f0000000180)=@framed={{}, [@printk={@integer, {}, {}, {}, {},
> {0x7, 0x0, 0xb, 0x3, 0x0, 0x0, 0xff600000}, {0x85, 0x0, 0x0, 0x71}}]},
> &(0x7f0000000200)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
> 0x90)
> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11,
> &(0x7f0000000540)={&(0x7f0000000000)='kfree\x00', r0}, 0x10)
>
> So syzkaller generated a BPF tracing program. 0x85 is BPF_JMP |
> BPF_CALL, which is used to invoke BPF helpers; 0x71 is 113, which is
> the number of the probe_read_kernel helper, which basically takes
> arbitrary values as input and casts them to kernel pointers, and then
> probe-reads them. And before that is some kinda ALU op with 0xff600000
> as immediate.
>
> So it looks like the answer to that question is "the BPF program tries
> to copy from the VSYSCALL page because syzkaller decided to write BPF
> code that does specifically that, and the BPF helper let it do that".

Indeed.

> copy_from_kernel_nofault() does check
> copy_from_kernel_nofault_allowed() to make sure the pointer really is
> a kernel pointer, and the X86 version of that rejects anything in the
> userspace part of the address space. But it does not know about the
> vsyscall area.

That's cureable. Untested fix below.

Thanks for the explanation!

tglx

---
diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 6993f026adec..8e846833aa37 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -3,6 +3,8 @@
#include <linux/uaccess.h>
#include <linux/kernel.h>

+#include <uapi/asm/vsyscall.h>
+
#ifdef CONFIG_X86_64
bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{
@@ -15,6 +17,9 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
return false;

+ if ((vaddr & PAGE_MASK) == VSYSCALL_ADDR)
+ return false;
+
/*
* Allow everything during early boot before 'x86_virt_bits'
* is initialized. Needed for instruction decoding in early