Re: [PATCH 1/2] x86/unwind/orc: recheck address range after stack info was updated

From: Peter Zijlstra
Date: Tue Apr 12 2022 - 07:27:50 EST


On Tue, Apr 12, 2022 at 10:40:03AM +0300, Dmitry Monakhov wrote:
> get_stack_info() detects stack type only by begin address, so we must
> check that address range in question is fully covered by detected stack
>
> Otherwise following crash is possible:
> -> unwind_next_frame
> case ORC_TYPE_REGS:
> if (!deref_stack_regs(state, sp, &state->ip, &state->sp))
> -> deref_stack_regs
> -> stack_access_ok <- here addr is inside stack range, but addr+len-1 is not, but we still exit with success
> *ip = READ_ONCE_NOCHECK(regs->ip); <- Here we hit stack guard fault
> OOPS LOG:
> <0>[ 1941.845743] BUG: stack guard page was hit at 000000000dd984a2 (stack is 00000000d1caafca..00000000613712f0)


> <4>[ 1941.845751] get_perf_callchain+0x10d/0x280
> <4>[ 1941.845751] perf_callchain+0x6e/0x80
> <4>[ 1941.845752] perf_prepare_sample+0x87/0x540
> <4>[ 1941.845752] perf_event_output_forward+0x31/0x90
> <4>[ 1941.845753] __perf_event_overflow+0x5a/0xf0
> <4>[ 1941.845754] perf_ibs_handle_irq+0x340/0x5b0
> <4>[ 1941.845757] perf_ibs_nmi_handler+0x34/0x60
> <4>[ 1941.845757] nmi_handle+0x79/0x190

Urgh, this is another instance of trying to unwind an IP that no longer
matches the stack.

Fixing the unwinder bug is good, but arguably we should also fix this
IBS stuff, see 6cbc304f2f36 ("perf/x86/intel: Fix unwind errors from
PEBS entries (mk-II)")
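
For illustration only, the range check that the quoted changelog asks for
can be sketched in plain user-space C as below. This is not the patch
itself and not the kernel's code: struct stack_range, start_on_stack(),
range_on_stack() and range_check_demo() are hypothetical names made up
for this sketch, and the stack is assumed to be a half-open interval
[begin, end). It only shows why checking the first byte is not enough
when the access is len bytes long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a detected stack: valid bytes are [begin, end). */
struct stack_range {
	uintptr_t begin;
	uintptr_t end;	/* one past the last valid byte */
};

/* Start-only check: looks at the first byte of the access and nothing else. */
static bool start_on_stack(const struct stack_range *s, uintptr_t addr)
{
	return addr >= s->begin && addr < s->end;
}

/* Full check: true only if [addr, addr + len) lies entirely inside the stack. */
static bool range_on_stack(const struct stack_range *s, uintptr_t addr,
			   size_t len)
{
	if (len == 0 || addr < s->begin || addr >= s->end)
		return false;

	/* Compare against the remaining space so addr + len cannot overflow. */
	return len <= s->end - addr;
}

/* Small self-check: a 16-byte read that starts 8 bytes before the end. */
static void range_check_demo(void)
{
	struct stack_range s = { .begin = 0x1000, .end = 0x2000 };

	assert(start_on_stack(&s, 0x1ff8));		/* start-only check passes */
	assert(!range_on_stack(&s, 0x1ff8, 16));	/* full check rejects it */
	assert(range_on_stack(&s, 0x1ff0, 16));		/* fits exactly */
}
```

In the sketch the start-only check accepts an access whose tail runs past
the end of the stack, which is the situation the changelog describes for
stack_access_ok(); the full check rejects it.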