Re: [syzbot] upstream test error: KASAN: invalid-access Read in __entry_tramp_text_end

From: Josh Poimboeuf
Date: Thu Sep 30 2021 - 15:26:47 EST


On Wed, Sep 29, 2021 at 01:43:23PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 29, 2021 at 11:37:30AM +0100, Mark Rutland wrote:
>
> > > This is because _ASM_EXTABLE only generates data for another section.
> > > There doesn't need to be code continuity between these two asm
> > > statements.
> >
> > I think you've missed my point. It doesn't matter that the
> > asm_volatile_goto() doesn't contain code, and this is solely about the
> > *state* expected at entry/exit from each asm block being different.
>
> Urgh.. indeed :/

So much for that idea :-/

To fix the issue of the wrong .fixup code symbol names getting printed,
we could (as Mark suggested) add a '__fixup_text_start' symbol at the
start of the .fixup section. And then remove all other symbols in the
.fixup section.

For x86, that means removing the kvm_fastop_exception symbol and a few
others. That way it's all anonymous code, displayed by the kernel as
"__fixup_text_start+0x1234". Which isn't all that useful, but still
better than printing the wrong symbol.

But there's still a bigger problem: the function with the faulting
instruction doesn't get reported in the stack trace.

For example, in the up-thread bug report, __d_lookup() doesn't get
printed, even though its anonymous .fixup code is running in the
context of the function and will be branching back to it shortly.

Even worse, this means livepatch is broken, because if for example
__d_lookup()'s .fixup code gets preempted, __d_lookup() can get skipped
by a reliable stack trace.

So we may need to get rid of .fixup altogether. Especially for arches
which support livepatch.

We can replace some of the custom .fixup handlers with generic handlers
like x86 does, which do the fixup work in exception context. This
generally works better for simple, common fixups like putting an error
code in a certain register and resuming execution at the subsequent
instruction.
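
As a rough illustration of that model, here is a small userspace sketch
of a table-driven fixup done entirely in "exception context". It is not
the kernel's code: struct fake_regs, struct fake_extable_entry and all
fake_* functions are hypothetical names made up for this example, and
the addresses are plain integers rather than real instruction addresses.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified register state (not the kernel's struct pt_regs). */
struct fake_regs {
	unsigned long ip;	/* instruction pointer */
	long ax;		/* result register */
};

struct fake_extable_entry;
typedef int (*fake_ex_handler_t)(const struct fake_extable_entry *e,
				 struct fake_regs *regs);

/* One entry per instruction that is allowed to fault. */
struct fake_extable_entry {
	unsigned long insn;	/* address of the faulting instruction */
	unsigned long resume;	/* resume address, inside the same function */
	fake_ex_handler_t handler;
};

/* Generic handler: put -EFAULT in ax and resume at the recorded address. */
static int fake_ex_handler_efault(const struct fake_extable_entry *e,
				  struct fake_regs *regs)
{
	regs->ax = -EFAULT;
	regs->ip = e->resume;
	return 1;
}

/* Linear search for brevity; a real table would be sorted and bisected. */
static const struct fake_extable_entry *
fake_search_extable(const struct fake_extable_entry *tbl, size_t n,
		    unsigned long ip)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == ip)
			return &tbl[i];
	return NULL;
}

/* Simulated fault path; returns 1 if the fault was fixed up, 0 otherwise. */
static int fake_fixup_exception(const struct fake_extable_entry *tbl,
				size_t n, struct fake_regs *regs)
{
	const struct fake_extable_entry *e =
		fake_search_extable(tbl, n, regs->ip);

	return e ? e->handler(e, regs) : 0;
}
```

The point of the sketch is that no out-of-line .fixup text ever runs:
the handler only edits the saved registers, so the interrupted function
is the only code on the stack both before and after the fault.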

However a lot of the .fixup code is rather custom and doesn't
necessarily work well with that model.

In such cases we could just move the .fixup code into the function
(inline for older compilers; out-of-line for compilers that support
CC_HAS_ASM_GOTO_OUTPUT).

Alternatively we could convert each .fixup code fragment into a proper
function which returns to a specified resume point in the function, and
then have the exception handler emulate a call to it like we do with
int3_emulate_call().
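
To make the "emulate a call" idea concrete, here is a minimal userspace
model of what the exception handler would do. Everything in it is an
assumption for illustration only: struct fake_cpu, the fake_* helpers
and the tiny array-backed stack are hypothetical and do not correspond
to real kernel interfaces.

```c
#include <stddef.h>

#define FAKE_STACK_WORDS 16

/* Hypothetical CPU state with a small downward-growing stack. */
struct fake_cpu {
	unsigned long ip;	/* instruction pointer */
	size_t sp;		/* index into stack[]; empty == FAKE_STACK_WORDS */
	unsigned long stack[FAKE_STACK_WORDS];
};

/* Push one word, as a CALL instruction would for its return address. */
static void fake_emulate_push(struct fake_cpu *cpu, unsigned long val)
{
	cpu->stack[--cpu->sp] = val;
}

/*
 * Emulate "call func" from exception context, with the return address
 * set to the resume point inside the faulting function.
 */
static void fake_emulate_call(struct fake_cpu *cpu, unsigned long func,
			      unsigned long resume)
{
	fake_emulate_push(cpu, resume);
	cpu->ip = func;
}

/* What the fixup function's final RET does: pop and jump to resume. */
static void fake_emulate_ret(struct fake_cpu *cpu)
{
	cpu->ip = cpu->stack[cpu->sp++];
}
```

In this model, while the fixup function runs, the word on top of the
stack is an address inside the faulting function, which is the property
a stack unwinder would need in order to report that function.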

--
Josh