Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in copy_from_kernel_nofault (2)

From: Andrii Nakryiko
Date: Fri Apr 05 2024 - 13:50:52 EST


On Fri, Apr 5, 2024 at 9:30 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, Apr 5, 2024 at 4:36 AM Russell King (Oracle)
> <linux@xxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Apr 05, 2024 at 12:02:36PM +0100, Mark Rutland wrote:
> > > On Thu, Apr 04, 2024 at 03:57:04PM -0700, Alexei Starovoitov wrote:
> > > > On Wed, Apr 3, 2024 at 6:56 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Mon, 01 Apr 2024 22:19:25 -0700 syzbot <syzbot+186522670e6722692d86@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > Hello,
> > > > >
> > > > > Thanks. Cc: bpf@xxxxxxxxxxxxxxx
> > > >
> > > > I suspect the issue is not on bpf side.
> > > > Looks like the bug is somewhere in arm32 bits.
> > > > copy_from_kernel_nofault() is called from lots of places.
> > > > bpf is just one user that is easy for syzbot to fuzz.
> > > > Interestingly arm defines copy_from_kernel_nofault_allowed()
> > > > that should have filtered out user addresses.
> > > > In this case ffffffe9 is probably a kernel address?
> > >
> > > It's at the end of the kernel range, and it's ERR_PTR(-EINVAL).
> > >
> > > 0xffffffe9 is -0x16, which is -22, which is -EINVAL.
> > >
> > > > But the kernel is doing a write?
> > > > Which makes no sense, since copy_from_kernel_nofault is probe reading.
> > >
> > > It makes perfect sense; the read from 'src' happened, then the kernel tries to
> > > write the result to 'dst', and that aligns with the disassembly in the report
> > > below, which I believe is:
> > >
> > > 8: e4942000 ldr r2, [r4], #0 <-- Read of 'src', fault fixup is elsewhere
> > > c: e3530000 cmp r3, #0
> > > * 10: e5852000 str r2, [r5] <-- Write to 'dst'
> > >
> > > As above, it looks like 'dst' is ERR_PTR(-EINVAL).
> > >
> > > Are you certain that BPF is passing a sane value for 'dst'? Where does that
> > > come from in the first place?
> >
> > It looks to me like it gets passed in from the BPF program, and the
> > "type" for the argument is set to ARG_PTR_TO_UNINIT_MEM. What that
> > means for validation purposes, I've no idea, I'm not a BPF hacker.
> >
> > Obviously, if BPF is allowing copy_from_kernel_nofault() to be passed
> > an arbitrary destination address, that would be a huge security hole.
>
> If that's the case that's indeed a giant security hole,
> but I doubt it. We would be crashing other archs as well.
> I cannot really tell whether arm32 JIT is on.
> If it is, it's likely a bug there.
> Puranjay,
> could you please take a look.
>

I dumped the BPF program that repro.c is loading. It works on x86-64
and there is nothing special there. We are probe-reading 5 bytes from
somewhere into the stack. Everything is unaligned here, but the
destination stays within a well-defined stack slot.

Note the r3 = (s8)r1; that sign-extending move is a new-ish thing, so
maybe the bug is somewhere there (but then it would be in the JIT, not
in the verifier itself).

0: (7a) *(u64 *)(r10 -8) = 896542069
1: (bf) r1 = r10
2: (07) r1 += -7
3: (b7) r2 = 5
4: (bf) r3 = (s8)r1
5: (85) call bpf_probe_read_kernel#-72390
6: (b7) r0 = 0
7: (95) exit
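
For reference, here is a rough user-space sketch of the arithmetic in the
dump above. It only models what the (s8) move and the stack offsets compute.
It is not kernel or JIT code, and it does not show where the bug is. The
helper names (bpf_movsx8, low32, within_slot) and the example address in the
usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of "rD = (s8)rS": keep the low 8 bits of the source and
 * sign-extend them to a full 64-bit BPF register.
 * (Hypothetical helper name, for illustration only.)
 */
static uint64_t bpf_movsx8(uint64_t src)
{
	uint64_t b = src & 0xffULL;

	return (b & 0x80ULL) ? (b | ~0xffULL) : b;
}

/* Low 32 bits of a BPF register, i.e. what a 32-bit kernel uses as a pointer. */
static uint32_t low32(uint64_t v)
{
	return (uint32_t)v;
}

/*
 * Does the access [off, off + len) fit inside the stack slot
 * [slot_off, slot_off + slot_len)? All offsets are relative to r10.
 */
static int within_slot(int off, int len, int slot_off, int slot_len)
{
	return off >= slot_off && off + len <= slot_off + slot_len;
}

/* Small self-check of the two helpers against the values in the dump. */
static void check_dump_arithmetic(void)
{
	/* insn 0 stores 8 bytes at r10-8; insns 1-3 set dst = r10-7, size = 5 */
	assert(within_slot(-7, 5, -8, 8));
	/* a positive low byte is left unchanged by the sign extension */
	assert(bpf_movsx8(0x7f) == 0x7f);
}
```

So the destination covers r10-7 up to r10-3, inside the 8-byte slot
written by insn 0. For the source, a pointer whose low byte has bit 7 set
collapses to a value near the top of the address space after the (s8) move.
With a made-up stack address of 0xdf9b9de9, low32(bpf_movsx8(0xdf9b9de9))
is 0xffffffe9.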