Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end
From: Dmitry Vyukov
Date: Mon Sep 27 2021 - 10:27:47 EST
On Tue, 21 Sept 2021 at 18:51, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Dmitry,
>
> The good news is that the bad unwind is a known issue, the bad news is
> that we don't currently have a way to fix it (and I'm planning to talk
> about this at the LPC "objtool on arm64" talk this Friday).
>
> More info below: the gist is we can produce spurious entries at an
> exception boundary, but shouldn't miss a legitimate value, and there's a
> plan to make it easier to spot when entries are not legitimate.
>
> On Fri, Sep 17, 2021 at 05:03:48PM +0200, Dmitry Vyukov wrote:
> > > Call trace:
> > > dump_backtrace+0x0/0x1ac arch/arm64/kernel/stacktrace.c:76
> > > show_stack+0x18/0x24 arch/arm64/kernel/stacktrace.c:215
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0x68/0x84 lib/dump_stack.c:105
> > > print_address_description+0x7c/0x2b4 mm/kasan/report.c:256
> > > __kasan_report mm/kasan/report.c:442 [inline]
> > > kasan_report+0x134/0x380 mm/kasan/report.c:459
> > > __do_kernel_fault+0x128/0x1bc arch/arm64/mm/fault.c:317
> > > do_bad_area arch/arm64/mm/fault.c:466 [inline]
> > > do_tag_check_fault+0x74/0x90 arch/arm64/mm/fault.c:737
> > > do_mem_abort+0x44/0xb4 arch/arm64/mm/fault.c:813
> > > el1_abort+0x40/0x60 arch/arm64/kernel/entry-common.c:357
> > > el1h_64_sync_handler+0xb0/0xd0 arch/arm64/kernel/entry-common.c:408
> > > el1h_64_sync+0x78/0x7c arch/arm64/kernel/entry.S:567
> > > __entry_tramp_text_end+0xdfc/0x3000
> >
> > /\/\/\/\/\/\/\
> >
> > This is broken unwind on arm64. d_lookup statically calls __d_lookup,
> > not __entry_tramp_text_end (which is not even a function).
> > See the following thread for some debugging details:
> > https://lore.kernel.org/lkml/CACT4Y+ZByJ71QfYHTByWaeCqZFxYfp8W8oyrK0baNaSJMDzoUw@xxxxxxxxxxxxxx/
>
> The problem here is that our calling convention (AAPCS64) only allows us
> to reliably unwind at function call boundaries, where the state of both
> the Link Register (LR/x30) and Frame Pointer (FP/x29) are well-defined.
> Within a function, we don't know whether to start unwinding from the LR
> or FP, and we currently start from the LR, which can produce spurious
> entries (but ensures we don't miss anything legitimate).
>
> In the short term, I have a plan to make the unwinder indicate when an
> entry might not be legitimate, with the usual stackdump code printing an
> indicator like '?' on x86.
>
> In the longer term, we might be doing things with objtool or asking for
> some toolchain help such that we can do better in these cases.
Hi Mark,
Any updates after the LPC session?
If the dumper adds " ? ", then syzkaller will strip these frames
(required for x86).
However, I am worried that we could then remove the true top frame and
attribute crashes to the wrong frames again.
Some naive questions:
1. Shouldn't the top frame for synchronous faults be in the PC/IP
register (I would assume LR/FP contains the caller of the current
frame)?
2. How did __entry_tramp_text_end, which is not a function, even end up
in LR? Shouldn't it always contain some code pointer (even if stale)?
3. Isn't there already something in the debug info to solve this
problem? Userspace programs don't use objtool, but I assume they can
print crash stacks somehow (?).
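
For illustration only, here is a minimal user-space sketch of the unwind
order discussed above. It is not the arm64 kernel unwinder. It assumes
the AAPCS64 frame record layout (saved FP followed by saved LR) and
models the proposed "might not be legitimate" marking with a flag. The
names regs_snapshot, trace_entry and unwind_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AAPCS64 frame record: the caller's FP, then the saved LR. */
struct frame_record {
    const struct frame_record *fp;
    uintptr_t lr;
};

/* Hypothetical register snapshot taken at an exception boundary. */
struct regs_snapshot {
    uintptr_t pc;
    uintptr_t lr;
    const struct frame_record *fp;
};

/* One backtrace entry; reliable == 0 models a '?'-style marker. */
struct trace_entry {
    uintptr_t addr;
    int reliable;
};

/*
 * Emit the faulting PC first, then the live LR marked as possibly
 * spurious (mid-function we cannot tell whether it is still a valid
 * return address), then walk the chain of frame records from FP.
 * Returns the number of entries written, at most max.
 */
size_t unwind_sketch(const struct regs_snapshot *regs,
                     struct trace_entry *out, size_t max)
{
    size_t n = 0;
    const struct frame_record *rec;

    assert(regs != NULL && out != NULL);

    if (n < max)
        out[n++] = (struct trace_entry){ regs->pc, 1 };
    if (n < max)
        out[n++] = (struct trace_entry){ regs->lr, 0 };
    for (rec = regs->fp; rec != NULL && n < max; rec = rec->fp)
        out[n++] = (struct trace_entry){ rec->lr, 1 };

    return n;
}
```

In this sketch a consumer that drops entries with reliable == 0 would
still keep the PC entry as the top frame, which is the property the
questions above are asking about. Whether the real unwinder can
guarantee that is exactly what is open.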