Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write in setup_arch

From: Samuel Holland
Date: Tue Sep 03 2024 - 12:44:13 EST


On 2024-09-03 11:05 AM, Marc Zyngier wrote:
> On Tue, 03 Sep 2024 16:39:28 +0100,
> Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>
>> On Mon, Sep 2, 2024 at 12:03 PM 'Aleksandr Nogikh' via kasan-dev
>> <kasan-dev@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> +kasan-dev
>>>
>>> On Sat, Aug 31, 2024 at 7:53 PM 'Marc Zyngier' via syzkaller-bugs
>>> <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> On Fri, 30 Aug 2024 10:52:54 +0100,
>>>> Will Deacon <will@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit: 33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
>>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
>>>>>
>>>>> +Marc, as this is his branch.
>>>>>
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
>>>>>> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>>>>>> userspace arch: arm64
>>>>
>>>> As it turns out, this isn't specific to this branch. I can reproduce
>>>> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
>>>> compiling with clang results in an unbootable kernel (without any
>>>> output at all).
>>>>
>>>> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
>>>> clang), and I wouldn't be surprised if we were hitting some kind of
>>>> odd limit.
>>>>
>>>>>>
>>>>>> Downloadable assets:
>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-33faa93b.raw.xz
>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/9093742fcee9/vmlinux-33faa93b.xz
>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/b1f599907931/Image-33faa93b.gz.xz
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+908886656a02769af987@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>>>>
>>>>>> Booting Linux on physical CPU 0x0000000000 [0x000f0510]
>>>>>> Linux version 6.11.0-rc5-syzkaller-g33faa93bc856 (syzkaller@syzkaller) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #0 SMP PREEMPT now
>>>>>> random: crng init done
>>>>>> Machine model: linux,dummy-virt
>>>>>> efi: UEFI not found.
>>>>>> NUMA: No NUMA configuration found
>>>>>> NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> NUMA: NODE_DATA [mem 0xbfc1d340-0xbfc20fff]
>>>>>> Zone ranges:
>>>>>> DMA [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> DMA32 empty
>>>>>> Normal empty
>>>>>> Device empty
>>>>>> Movable zone start for each node
>>>>>> Early memory node ranges
>>>>>> node 0: [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
>>>>>> cma: Reserved 32 MiB at 0x00000000bba00000 on node -1
>>>>>> psci: probing for conduit method from DT.
>>>>>> psci: PSCIv1.1 detected in firmware.
>>>>>> psci: Using standard PSCI v0.2 function IDs
>>>>>> psci: Trusted OS migration not required
>>>>>> psci: SMC Calling Convention v1.0
>>>>>> ==================================================================
>>>>>> BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
>>>>>> BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
>>>>>> Write of size 4 at addr 03ff800086867e00 by task swapper/0
>>>>>> Pointer tag: [03], memory tag: [fe]
>>>>>>
>>>>>> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc5-syzkaller-g33faa93bc856 #0
>>>>>> Hardware name: linux,dummy-virt (DT)
>>>>>> Call trace:
>>>>>> dump_backtrace+0x204/0x3b8 arch/arm64/kernel/stacktrace.c:317
>>>>>> show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
>>>>>> __dump_stack lib/dump_stack.c:93 [inline]
>>>>>> dump_stack_lvl+0x260/0x3b4 lib/dump_stack.c:119
>>>>>> print_address_description mm/kasan/report.c:377 [inline]
>>>>>> print_report+0x118/0x5ac mm/kasan/report.c:488
>>>>>> kasan_report+0xc8/0x108 mm/kasan/report.c:601
>>>>>> kasan_check_range+0x94/0xb8 mm/kasan/sw_tags.c:84
>>>>>> __hwasan_store4_noabort+0x20/0x2c mm/kasan/sw_tags.c:149
>>>>>> smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
>>>>>> setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
>>>>>> start_kernel+0xe0/0xff0 init/main.c:926
>>>>>> __primary_switched+0x84/0x8c arch/arm64/kernel/head.S:243
>>>>>>
>>>>>> The buggy address belongs to stack of task swapper/0
>>>>>>
>>>>>> Memory state around the buggy address:
>>>>>> ffff800086867c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>> ffff800086867d00: 00 fe fe 00 00 00 fe fe fe fe fe fe fe fe fe fe
>>>>>>> ffff800086867e00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>> ^
>>>>>> ffff800086867f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>> ffff800086868000: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>>>>>> ==================================================================
>>>>>
>>>>> I can't spot the issue here. We have a couple of fixed-length
>>>>> (4 element) arrays on the stack and they're indexed by a simple loop
>>>>> counter that runs from 0-3.
>>>>
>>>> Having trimmed the config to the extreme, I can only trigger the
>>>> warning with CONFIG_KASAN_SW_TAGS (CONFIG_KASAN_GENERIC does not
>>>> scream). Same thing if I use gcc 14.2.0.
>>>>
>>>> However, compiling with clang 14 (Debian clang version 14.0.6) does
>>>> *not* result in a screaming kernel, even with KASAN_SW_TAGS.
>>>>
>>>> So I can see two possibilities here:
>>>>
>>>> - either gcc is incompatible with KASAN_SW_TAGS and the generic
>>>> version is the only one that works
>>>>
>>>> - or we have a compiler bug on our hands.
>>>>
>>>> Frankly, I can't believe the latter, as the code is so daft that I
>>>> can't imagine gcc getting it *that* wrong.
>>>>
>>>> Who knows enough about KASAN to dig into this?
>>
>> This looks related to Samuel's "arm64: Fix KASAN random tag seed
>> initialization" patch that landed in August.
>
> f75c235565f9 arm64: Fix KASAN random tag seed initialization
>
> $ git describe --contains f75c235565f9 --match=v\*
> v6.11-rc4~15^2
>
> So while this is in -rc4, -rc6 still has the same issue (with GCC --
> clang is OK).

I wouldn't expect it to be related to my patch. smp_build_mpidr_hash() gets
called before kasan_init_sw_tags() both before and after applying my patch.

Since the variable in question is a stack variable, its random tag is generated
by code that GCC emits, not by the kernel's tag-generation function.

Since smp_build_mpidr_hash() is inlined into setup_arch(), which also calls
kasan_init(), maybe the issue is that GCC tries to allocate the local variable
and write the tag to shadow memory before kasan_init() actually sets up the
shadow memory?
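
For reference, here is a rough user-space sketch of the comparison behind the
"Pointer tag: [03], memory tag: [fe]" line in the report. It is not the kernel
code. The helper names (pointer_tag, reset_tag, access_allowed) are made up for
illustration, and the layout (tag in the top byte, 0xff meaning "do not check")
is my understanding of KASAN_SW_TAGS, stated here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the software tag occupies bits 63:56 of the pointer. */
#define TAG_SHIFT  56
/* Assumption: 0xff is the tag carried by ordinary, unchecked kernel pointers. */
#define TAG_KERNEL 0xffu

/* Extract the tag stored in the top byte of an address. */
static uint8_t pointer_tag(uint64_t addr)
{
	return (uint8_t)(addr >> TAG_SHIFT);
}

/* Force the top byte back to 0xff, giving the address shown in the shadow dump. */
static uint64_t reset_tag(uint64_t addr)
{
	return addr | ((uint64_t)TAG_KERNEL << TAG_SHIFT);
}

/*
 * Hypothetical stand-in for the check: an access passes when the pointer
 * carries the "do not check" tag or when its tag equals the memory tag
 * recorded in shadow memory for that granule.
 */
static bool access_allowed(uint64_t addr, uint8_t mem_tag)
{
	uint8_t tag = pointer_tag(addr);

	if (tag == TAG_KERNEL)
		return true;
	return tag == mem_tag;
}

/* Replay the values from the report above. */
static void replay_report(void)
{
	const uint64_t addr = 0x03ff800086867e00ULL;

	assert(pointer_tag(addr) == 0x03);
	assert(reset_tag(addr) == 0xffff800086867e00ULL);
	assert(!access_allowed(addr, 0xfe));
}
```

With the values from the report, the pointer carries tag 0x03 while the shadow
byte reads 0xfe, so the comparison fails and the write gets reported. If the
shadow byte had been written as 0x03 before the store, the same access would
pass.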

Regards,
Samuel

>> I am a bit surprised the bug is reported before the
>> "KernelAddressSanitizer initialized" banner is printed - I thought we
>> shouldn't be reporting anything until the tool is fully initialized.
>
> Especially if this can report false positives...
>
> Thanks,
>
> M.
>