Re: [PATCH] powerpc/irq: Increase stack_overflow detection limit when KASAN is enabled

From: Christophe Leroy
Date: Wed Jun 01 2022 - 11:33:51 EST




On 31/05/2022 at 08:21, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
>> When KASAN is enabled, as shown by the Oops below, the 2k limit is not
>> enough to allow stack dump after a stack overflow detection when
>> CONFIG_DEBUG_STACKOVERFLOW is selected:
>>
>> do_IRQ: stack overflow: 1984
>> CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
>> Call Trace:
>> Oops: Kernel stack overflow, sig: 11 [#1]
>> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
>> Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs
>> CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
>> NIP: c02e5558 LR: c07eb3bc CTR: c07f46a8
>> REGS: e7fe9f50 TRAP: 0000 Not tainted (5.18.0-gentoo-PMacG4)
>> MSR: 00001032 <ME,IR,DR,RI> CR: 44a14824 XER: 20000000
>>
>> GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc eaa1c010
>> GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 00000010
>> GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 eaa1c29e
>> GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 c0c9a3c0
>> NIP [c02e5558] kasan_check_range+0xc/0x2b4
>> LR [c07eb3bc] format_decode+0x80/0x604
>> Call Trace:
>> [eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable)
>> [eaa1c070] [c07f4dac] vsnprintf+0x128/0x938
>> [eaa1c110] [c07f5788] sprintf+0xa0/0xc0
>> [eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198
>> [eaa1c230] [c07ee71c] symbol_string+0xf8/0x260
>> [eaa1c430] [c07f46d0] pointer+0x15c/0x710
>> [eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938
>> [eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678
>> [eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378
>> [eaa1c6d0] [c00ea008] _printk+0x9c/0xc0
>> [eaa1c750] [c000ca94] show_stack+0x21c/0x260
>> [eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90
>> [eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174
>> [eaa1c800] [c0009258] do_IRQ+0x20/0x34
>> [eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
>
> Is this actually caused by KASAN? There's no stack frames in there that
> are KASAN related AFAICS.
>
> Seems like the 2K limit is never going to be enough even if KASAN is not
> enabled. Presumably we just haven't noticed because we don't trigger the
> check unless KASAN is enabled.

I ran some tests on PPC32.

Without KASAN, dump_stack() succeeds as long as at least 1120 bytes
remain free on the stack.

With KASAN, dump_stack() succeeds as long as at least 2096 bytes
remain free on the stack.
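
For reference, the 1984 in the report is just the low bits of the stack
pointer, and it matches the __do_IRQ frame address in the trace. Below is a
minimal user-space sketch of that arithmetic. The helper names are
hypothetical (not kernel API), and the 8K THREAD_SIZE used in the usage note
is my assumption for illustration; the masked value is the same with 16K.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Measured minimum free stack for dump_stack() to complete on PPC32. */
#define DUMP_STACK_MIN_NO_KASAN 1120u
#define DUMP_STACK_MIN_KASAN    2096u

/* Hypothetical helper: bytes left below sp, mirroring sp & (THREAD_SIZE - 1). */
static uint32_t stack_bytes_free(uint32_t sp, uint32_t thread_size)
{
	/* The mask trick only works for power-of-two stack sizes. */
	assert(thread_size != 0 && (thread_size & (thread_size - 1)) == 0);
	return sp & (thread_size - 1);
}

/* Hypothetical helper: is there enough room left for dump_stack() to run? */
static bool dump_stack_fits(uint32_t bytes_free, bool kasan)
{
	return bytes_free >= (kasan ? DUMP_STACK_MIN_KASAN : DUMP_STACK_MIN_NO_KASAN);
}
```

With sp = 0xeaa1c7c0, stack_bytes_free(sp, 8192) gives 1984, and
dump_stack_fits(1984, false) holds while dump_stack_fits(1984, true) does
not, which is consistent with the Oops only showing up with KASAN.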

>
>> ...
>>
>> Increase the limit to 3k when KASAN is enabled.
>>
>> While at it remove the 'inline' keyword for check_stack_overflow().
>> This function is called only once so it will be inlined regardless.
>
> I'd rather that was a separate change, in case it has some unintended
> effect.
>
>> Reported-by: Erhard Furtner <erhard_f@xxxxxxxxxxx>
>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>> Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
>> ---
>> arch/powerpc/kernel/irq.c | 16 ++++++++++------
>> 1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 873e6dffb868..5ff4cf69fc2f 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -53,6 +53,7 @@
>> #include <linux/vmalloc.h>
>> #include <linux/pgtable.h>
>> #include <linux/static_call.h>
>> +#include <linux/sizes.h>
>>
>> #include <linux/uaccess.h>
>> #include <asm/interrupt.h>
>> @@ -184,7 +185,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
>> return sum;
>> }
>>
>> -static inline void check_stack_overflow(void)
>> +static void check_stack_overflow(void)
>> {
>> long sp;
>>
>> @@ -193,11 +194,14 @@ static inline void check_stack_overflow(void)
>>
>
> Wouldn't it be cleaner to just do:
>
> #ifdef CONFIG_KASAN
> #define STACK_CHECK_LIMIT (3 * 1024)
> #else
> #define STACK_CHECK_LIMIT (2 * 1024)
> #endif
>
>> sp = current_stack_pointer & (THREAD_SIZE - 1);
>>
>> - /* check for stack overflow: is there less than 2KB free? */
>> - if (unlikely(sp < 2048)) {
>
> + if (unlikely(sp < STACK_CHECK_LIMIT)) {
>
> And then the code could stay as it is?
>
> cheers
>
>> - pr_err("do_IRQ: stack overflow: %ld\n", sp);
>> - dump_stack();
>> - }
>> + /* check for stack overflow: is there less than 2KB (3KB with KASAN) free? */
>> + if (!IS_ENABLED(CONFIG_KASAN) && likely(sp >= SZ_2K))
>> + return;
>> + if (IS_ENABLED(CONFIG_KASAN) && likely(sp >= SZ_2K + SZ_1K))
>> + return;
>> +
>> + pr_err("do_IRQ: stack overflow: %ld\n", sp);
>> + dump_stack();
>> }
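
To compare the suggested limits against the measured dump_stack() minimums
(1120 bytes without KASAN, 2096 bytes with it), here is a rough user-space
sketch of the #ifdef variant. stack_is_low() and the MARGIN_* names are
hypothetical stand-ins for illustration only, and CONFIG_KASAN would normally
come from the kernel configuration.

```c
#include <stdbool.h>

/* Limits as in the suggestion quoted above. */
#ifdef CONFIG_KASAN
#define STACK_CHECK_LIMIT (3 * 1024)
#else
#define STACK_CHECK_LIMIT (2 * 1024)
#endif

/*
 * Hypothetical stand-in for the test in check_stack_overflow():
 * sp_offset is sp & (THREAD_SIZE - 1), i.e. the bytes still free.
 */
static bool stack_is_low(long sp_offset)
{
	return sp_offset < STACK_CHECK_LIMIT;
}

/* Margin each limit leaves over the measured dump_stack() minimum. */
enum {
	MARGIN_NO_KASAN = 2 * 1024 - 1120,	/* 928 bytes */
	MARGIN_KASAN    = 3 * 1024 - 2096,	/* 976 bytes */
};
```

Built without CONFIG_KASAN defined, stack_is_low(1984) is true and
stack_is_low(2048) is false. Either limit leaves a bit under 1K of margin
over what dump_stack() was measured to need.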