Re: Crash on armv7-a using KASAN

From: Ard Biesheuvel
Date: Tue Oct 15 2024 - 10:22:56 EST


On Tue, 15 Oct 2024 at 16:00, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Tue, Oct 15, 2024 at 03:51:02PM +0200, Linus Walleij wrote:
> > On Tue, Oct 15, 2024 at 12:28 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > > On Mon, Oct 14, 2024 at 03:19:49PM +0200, Clement LE GOFFIC wrote:
> >
> > > I think what's happening here is that when switching from prev to next
> > > in the scheduler, we switch to next's mm before we actually switch to
> > > next's register state, and there's a transient window where prev is
> > > executed using next's mm. AFAICT we don't map prev's KASAN stack shadow
> > > into next's mm anywhere, and so inlined KASAN_STACK checks recursively
> > > fault on this until we switch to the overflow stack.
> >
> > Oh my, that's pretty advanced. Well spotted!
> > So it has nothing to do with Ards commit, correlation does not
> > imply causation.
> >
> > > More details on that below.
> > >
> > > Linus, are you able to look into this?
> >
> > Of course, I'm trying to reproduce the bug.
> >
> > > > __dabt_svc from do_translation_fault+0x30/0x2b0
> > > > do_translation_fault from do_DataAbort+0x74/0x1dc
> > > > do_DataAbort from __dabt_svc+0x4c/0x80
> > > > Exception stack(0xac003ad8 to 0xac003b20)
> > > > 3ac0: ac003bc8
> > > > 00000005
> > > > 3ae0: ac003b88 74800779 7480078f ac003b88 7480078f ac003b88 00000005
> > > > 82412640
> > > > 3b00: ac003d20 ac003d54 00000051 ac003b28 80125c14 80125920 200f0193
> > > > ffffffff
> > > > __dabt_svc from do_translation_fault+0x30/0x2b0
> > > > do_translation_fault from do_DataAbort+0x74/0x1dc
> > > > do_DataAbort from __dabt_svc+0x4c/0x80
> > > > Exception stack(0xac003b88 to 0xac003bd0)
> > > > 3b80: ac003c78 00000805 ac003c38 7480078f 74800798
> > > > ac003c38
> > > > 3ba0: 74800798 ac003c38 00000805 82412640 ac003d20 ac003d54 00000051
> > > > ac003bd8
> > > > 3bc0: 80125c14 80125920 200f0193 ffffffff
> > > > __dabt_svc from do_translation_fault+0x30/0x2b0
> > > > do_translation_fault from do_DataAbort+0x74/0x1dc
> > > > do_DataAbort from __dabt_svc+0x4c/0x80
> > >
> > > The above frames are the same; whatever the kernel is accessing at
> > > do_translation_fault+0x30 is causing this to go recursive...
> > >
> > > I can reproduce this, pretty easily, with a similar enough trace, though
> > > faddr2line isn't happy to give me a line number.
> >
> > Did you reproduce it the same way with a few find /?
> >
> > I am trying to reproduce it and failing :/
> > (Using Torvald's HEAD)
> >
> > This is my config:
> >
> > CONFIG_HAVE_ARCH_KASAN=y
> > CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> > CONFIG_CC_HAS_KASAN_GENERIC=y
> > CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
> > CONFIG_KASAN=y
> > CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
> > CONFIG_KASAN_GENERIC=y
> > CONFIG_KASAN_OUTLINE=y
> > # CONFIG_KASAN_INLINE is not set
> > # CONFIG_KASAN_STACK is not set
> > # CONFIG_KASAN_VMALLOC is not set
> > # CONFIG_KASAN_EXTRA_INFO is not set
> >
> > Do you use more KASAN?
>
> I used a config per Clement's instructions, i.e.
>
> | [mark@lakrids:~/src/linux]% git checkout v6.12-rc3
> | HEAD is now at 8e929cb546ee4 Linux 6.12-rc3
> | [mark@lakrids:~/src/linux]% git clean -qfdx
> | [mark@lakrids:~/src/linux]% echo 'CONFIG_KASAN=y' > arch/arm/configs/fragment-kasan.config
> | [mark@lakrids:~/src/linux]% usekorg 14.2.0 make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress_defconfig fragment-kasan.config
> | HOSTCC scripts/basic/fixdep
> | HOSTCC scripts/kconfig/conf.o
> | HOSTCC scripts/kconfig/confdata.o
> | HOSTCC scripts/kconfig/expr.o
> | LEX scripts/kconfig/lexer.lex.c
> | YACC scripts/kconfig/parser.tab.[ch]
> | HOSTCC scripts/kconfig/lexer.lex.o
> | HOSTCC scripts/kconfig/menu.o
> | HOSTCC scripts/kconfig/parser.tab.o
> | HOSTCC scripts/kconfig/preprocess.o
> | HOSTCC scripts/kconfig/symbol.o
> | HOSTCC scripts/kconfig/util.o
> | HOSTLD scripts/kconfig/conf
> | #
> | # configuration written to .config
> | #
> | Using .config as base
> | Merging ./arch/arm/configs/fragment-kasan.config
> | Value of CONFIG_KASAN is redefined by fragment ./arch/arm/configs/fragment-kasan.config:
> | Previous value: # CONFIG_KASAN is not set
> | New value: CONFIG_KASAN=y
> |
> | #
> | # merged configuration written to .config (needs make)
> | #
> | #
> | # configuration written to .config
> | #
> | [mark@lakrids:~/src/linux]% grep KASAN .config
> | CONFIG_KASAN_SHADOW_OFFSET=0x5f000000
> | CONFIG_HAVE_ARCH_KASAN=y
> | CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> | CONFIG_CC_HAS_KASAN_GENERIC=y
> | CONFIG_KASAN=y
> | CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
> | CONFIG_KASAN_GENERIC=y
> | # CONFIG_KASAN_OUTLINE is not set
> | CONFIG_KASAN_INLINE=y
> | CONFIG_KASAN_STACK=y
> | CONFIG_KASAN_VMALLOC=y
> | # CONFIG_KASAN_MODULE_TEST is not set
> | # CONFIG_KASAN_EXTRA_INFO is not set
>
> IIUC you'll need CONFIG_KASAN_INLINE=y, CONFIG_KASAN_STACK=y, and
> CONFIG_KASAN_VMALLOC=y.
>
> With that, just booting and tapping a few keys was sufficient to trigger
> a crash using Debian the initrd Clement linked.
>
> [...]
>
> > > The relevant asm is:
> > (...)
> > > ... so we're using the new task's mm, but still executing in the context of the
> > > old task (and using its stack). I suspect the new task's mm doesn't have the
> > > old task's stack shadow mapped in, and AFAICT we don't map that in explicitly
> > > anywhere before we switch to the new mm.
> > >
> > > Linus, can you look into that?
> >
> > Yeah it looks like a spot-on identification of the problem, I can try to
> > think about how we could fix this if I can reproduce it, I keep trying
> > to provoke the crash :/
>
> It's a bit grotty -- AFAICT you'd either need to prefault in the
> specific part of the vmalloc space when switching tasks, or we'd need to
> preallocate all the shared vmalloc tables at the start of time so that
> they're always up-to-date.
>
> While we could disable KASAN_STACK, that's only going to mask the
> problem until this happens for any other vmalloc shadow...
>

Is the other vmalloc shadow not covered by the ordinary on-demand faulting?

When I implemented VMAP_STACK for ARM, I added an explicit load from
the new stack while still running from the old one (in __switch_to) so
that the ordinary faulting code can deal with it. Couldn't we do the
same for the vmalloc shadow of the new stack?