general protection fault, probably for non-canonical address in pick_next_task_fair()

From: Breno Leitao
Date: Thu Feb 29 2024 - 11:15:47 EST


I've been running stress tests using stress-ng on a kernel with some
debug options enabled, such as KASAN and friends (see the config below).

I first saw it in rc4, and the decoded instructions are a bit off (as they
are here as well: search for movabs in the dmesg below and you will find
something like `(bad)`), so I thought it was a machine issue. But now I see
it again, so I am sharing for awareness.

This is happening on the upstream kernel at the following commit:
d206a76d7d2726 ("Linux 6.8-rc6")

This is the excerpt that shows up before the crash:

general protection fault, probably for non-canonical address 0xdffffc0000000014: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
KASAN: null-ptr-deref in range [0x00000000000000a0-0x00000000000000a7]
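
For readers wondering how the two addresses above relate: the faulting
address looks like a KASAN shadow address, and the reported range is the
original memory it would cover. The sketch below reproduces that arithmetic.
The constants are assumptions (generic KASAN on x86_64 with the default
shadow offset and one shadow byte per 8 bytes of memory), and the helper name
shadow_to_range is made up for illustration.

```python
# Assumed values for generic KASAN on x86_64; not taken from the report itself.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000
KASAN_SHADOW_SCALE_SHIFT = 3  # one shadow byte covers 8 bytes of memory
MASK64 = (1 << 64) - 1


def shadow_to_range(fault_addr):
    """Map a shadow address back to the first and last byte it describes."""
    first = ((fault_addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT) & MASK64
    last = first + (1 << KASAN_SHADOW_SCALE_SHIFT) - 1
    return first, last


if __name__ == "__main__":
    first, last = shadow_to_range(0xDFFFFC0000000014)
    print(f"[{first:#018x}-{last:#018x}]")
```

Under those assumptions, 0xdffffc0000000014 maps back to offsets 0xa0-0xa7,
which would be consistent with a field access at offset 0xa0 through a NULL
pointer.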

This is the stack that is hitting it:

? __die_body (arch/x86/kernel/dumpstack.c:421)
? die_addr (arch/x86/kernel/dumpstack.c:460)
? exc_general_protection (arch/x86/kernel/traps.c:? arch/x86/kernel/traps.c:643)
? asm_exc_general_protection (arch/x86/include/asm/idtentry.h:564)
? pick_next_task_fair (kernel/sched/sched.h:1453 kernel/sched/fair.c:8435)
? pick_next_task_fair (kernel/sched/fair.c:5463 kernel/sched/fair.c:8434)
? update_rq_clock_task (kernel/sched/core.c:?)
__schedule (kernel/sched/core.c:6022 kernel/sched/core.c:6545 kernel/sched/core.c:6691)
schedule (kernel/sched/core.c:6803 kernel/sched/core.c:6817)
syscall_exit_to_user_mode (kernel/entry/common.c:98 include/linux/entry-common.h:328 kernel/entry/common.c:201 kernel/entry/common.c:212)
do_syscall_64 (arch/x86/entry/common.c:102)
? irqentry_exit_to_user_mode (kernel/entry/common.c:228)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)

Full dmesg: https://paste.mozilla.org/RiLnt4QO#
Configs: https://paste.mozilla.org/XJ9wbdRp