Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end

From: Dmitry Vyukov
Date: Mon Sep 27 2021 - 10:30:39 EST


On Mon, 27 Sept 2021 at 16:27, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Tue, 21 Sept 2021 at 18:51, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > Hi Dmitry,
> >
> > The good news is that the bad unwind is a known issue, the bad news is
> > that we don't currently have a way to fix it (and I'm planning to talk
> > about this at the LPC "objtool on arm64" talk this Friday).
> >
> > More info below: the gist is we can produce spurious entries at an
> > exception boundary, but shouldn't miss a legitimate value, and there's a
> > plan to make it easier to spot when entries are not legitimate.
> >
> > On Fri, Sep 17, 2021 at 05:03:48PM +0200, Dmitry Vyukov wrote:
> > > > Call trace:
> > > > dump_backtrace+0x0/0x1ac arch/arm64/kernel/stacktrace.c:76
> > > > show_stack+0x18/0x24 arch/arm64/kernel/stacktrace.c:215
> > > > __dump_stack lib/dump_stack.c:88 [inline]
> > > > dump_stack_lvl+0x68/0x84 lib/dump_stack.c:105
> > > > print_address_description+0x7c/0x2b4 mm/kasan/report.c:256
> > > > __kasan_report mm/kasan/report.c:442 [inline]
> > > > kasan_report+0x134/0x380 mm/kasan/report.c:459
> > > > __do_kernel_fault+0x128/0x1bc arch/arm64/mm/fault.c:317
> > > > do_bad_area arch/arm64/mm/fault.c:466 [inline]
> > > > do_tag_check_fault+0x74/0x90 arch/arm64/mm/fault.c:737
> > > > do_mem_abort+0x44/0xb4 arch/arm64/mm/fault.c:813
> > > > el1_abort+0x40/0x60 arch/arm64/kernel/entry-common.c:357
> > > > el1h_64_sync_handler+0xb0/0xd0 arch/arm64/kernel/entry-common.c:408
> > > > el1h_64_sync+0x78/0x7c arch/arm64/kernel/entry.S:567
> > > > __entry_tramp_text_end+0xdfc/0x3000
> > >
> > > /\/\/\/\/\/\/\
> > >
> > > This is broken unwind on arm64. d_lookup statically calls __d_lookup,
> > > not __entry_tramp_text_end (which is not even a function).
> > > See the following thread for some debugging details:
> > > https://lore.kernel.org/lkml/CACT4Y+ZByJ71QfYHTByWaeCqZFxYfp8W8oyrK0baNaSJMDzoUw@xxxxxxxxxxxxxx/
> >
> > The problem here is that our calling convention (AAPCS64) only allows us
> > to reliably unwind at function call boundaries, where the state of both
> > the Link Register (LR/x30) and Frame Pointer (FP/x29) are well-defined.
> > Within a function, we don't know whether to start unwinding from the LR
> > or FP, and we currently start from the LR, which can produce spurious
> > entries (but ensures we don't miss anything legitimate).
> >
> > In the short term, I have a plan to make the unwinder indicate when an
> > entry might not be legitimate, with the usual stackdump code printing an
> > indicator like '?' on x86.
> >
> > In the longer term, we might be doing things with objtool or asking for
> > some toolchain help such that we can do better in these cases.
>
> Hi Mark,
>
> Any updates after the LPC session?
>
> If the dumper adds " ? ", then syzkaller will strip these frames
> (required for x86).
> However, I am worried that we could then remove the true top frame and
> attribute crashes to the wrong frames again.
>
> Some naive questions:
> 1. Shouldn't the top frame for synchronous faults be in the PC/IP
> register (I would assume LR/FP contains the caller of the current
> frame)?
> 2. How did __entry_tramp_text_end, which is not a function, even end up
> in LR? Shouldn't it always contain some code pointer (even if stale)?
> 3. Isn't there already something in the debug info to solve this
> problem? Userspace programs don't use objtool, but I assume they can
> print crash stacks somehow (?).

+Will, Serban,

This ARM64 unwinder issue also means that all kernel MTE reports will
contain a wrong top frame, right?
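
To make the concern concrete, here is a minimal userspace sketch of the
scheme Mark describes above: start from the live LR, flag that entry as
possibly spurious, then follow the FP chain of AAPCS64 frame records. This
is not the kernel's unwinder; the names frame_record, trace_entry and
unwind_from_exception are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* AAPCS64 frame record: the caller's FP followed by the saved LR. */
struct frame_record {
    const struct frame_record *fp; /* saved x29: caller's frame record */
    uintptr_t lr;                  /* saved x30: return address */
};

/* Hypothetical trace entry; 'unreliable' plays the role of x86's '?'. */
struct trace_entry {
    uintptr_t addr;
    int unreliable;
};

/*
 * Unwind from an exception boundary (illustrative only).
 * The live LR is reported first and flagged, because mid-function it may
 * be stale or may not hold a return address at all. Entries taken from
 * the frame-record chain are reported as reliable.
 * Returns the number of entries written to 'out'.
 */
size_t unwind_from_exception(uintptr_t lr, const struct frame_record *fp,
                             struct trace_entry *out, size_t max)
{
    size_t n = 0;

    if (n < max) {
        out[n].addr = lr;
        out[n].unreliable = 1;
        n++;
    }
    while (fp != NULL && n < max) {
        out[n].addr = fp->lr;
        out[n].unreliable = 0;
        n++;
        fp = fp->fp;
    }
    return n;
}
```

If a report parser drops every flagged entry, the first frame it keeps is
the one taken from the frame-record chain. Whether that is the true top
frame depends on whether the LR entry was spurious or legitimate, which is
exactly the ambiguity in question.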