Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current

From: Mark Rutland
Date: Wed Mar 17 2021 - 15:35:23 EST


On Wed, Mar 17, 2021 at 06:36:36PM +0000, Catalin Marinas wrote:
> On Wed, Mar 17, 2021 at 02:20:50PM +0000, Chen Jun wrote:
> > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same
> > stack:
> > stack_trace_save+0x4c/0x78
> > register_early_stack+0x34/0x70
> > init_page_owner+0x34/0x230
> > page_ext_init+0x1bc/0x1dc
> >
> > The reason is that:
> > check_recursive_alloc always returns 1 because
> > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60).
> >
> > The root cause is that:
> > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
> > makes save_trace save 2 more entries.
> >
> > Add skip in arch_stack_walk when task == current.
> >
> > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
> > Signed-off-by: Chen Jun <chenjun102@xxxxxxxxxx>
> > ---
> > arch/arm64/kernel/stacktrace.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index ad20981..c26b0ac 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> >
> >  	if (regs)
> >  		start_backtrace(&frame, regs->regs[29], regs->pc);
> > -	else if (task == current)
> > +	else if (task == current) {
> > +		((struct stacktrace_cookie *)cookie)->skip += 2;
> >  		start_backtrace(&frame,
> >  				(unsigned long)__builtin_frame_address(0),
> >  				(unsigned long)arch_stack_walk);
> > -	else
> > +	} else
> >  		start_backtrace(&frame, thread_saved_fp(task),
> >  				thread_saved_pc(task));
>
> I don't like abusing the cookie here. It's void * as it's meant to be an
> opaque type. I'd rather skip the first two frames in walk_stackframe()
> instead before invoking fn().

I agree that we shouldn't touch cookie here.

I don't think that it's right to bodge this inside walk_stackframe(),
since that'll add bogus skipping for the case starting with regs in the
current task. If we need a bodge, it has to live in arch_stack_walk()
where we set up the initial unwinding state.
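
To illustrate the concern, here is a rough user-space model, not kernel
code. Every name in it is made up for illustration, and the call chains
are plain string arrays standing in for the PCs an unwinder would report.
It only shows that a walker which skips two entries whenever the task is
current cannot tell an unwind started from its own frame apart from one
started from regs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_ENTRIES 8

/* Hypothetical stand-in for the consumer's storage; not a kernel type. */
struct model_trace {
	const char *entries[MODEL_MAX_ENTRIES];
	size_t nr;
};

/*
 * Model of a walker that skips two entries whenever the task is current,
 * without knowing where the unwind started.
 */
void model_walk(const char *const *chain, size_t len, bool task_is_current,
		struct model_trace *trace)
{
	size_t skip = task_is_current ? 2 : 0;
	size_t i;

	trace->nr = 0;
	for (i = 0; i < len && trace->nr < MODEL_MAX_ENTRIES; i++) {
		if (skip) {
			skip--;
			continue;
		}
		trace->entries[trace->nr++] = chain[i];
	}
}

/* Unwind started from the walker's own frame (the no-regs case). */
const char *const model_chain_no_regs[] = {
	"arch_stack_walk", "stack_trace_save", "caller", "caller_parent",
};

/* Unwind started from regs: the first entry is the interrupted PC. */
const char *const model_chain_regs[] = {
	"interrupted_fn", "interrupted_parent", "interrupted_grandparent",
};

void model_demo(void)
{
	struct model_trace t;

	/* No regs: the two dropped entries are unwinder internals. */
	model_walk(model_chain_no_regs, 4, true, &t);
	assert(t.nr == 2 && strcmp(t.entries[0], "caller") == 0);

	/* Regs in the current task: two genuine entries are lost. */
	model_walk(model_chain_regs, 3, true, &t);
	assert(t.nr == 1);
	assert(strcmp(t.entries[0], "interrupted_grandparent") == 0);
}
```

In the regs case the model drops "interrupted_fn" and
"interrupted_parent", which are exactly the entries a caller passing regs
would want to see first.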

In another thread, we came to the conclusion that arch_stack_walk()
should start at its parent, and its parent should add any skipping it
requires.
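
As a sketch of that convention, again a user-space model with
hypothetical names rather than the actual kernel implementation: the
walker starts reporting at its parent and treats the cookie as opaque,
while the parent, which owns the cookie, adds one to the skip count to
hide itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cookie owned by the generic caller; opaque to the walker. */
struct sketch_cookie {
	const char **store;
	size_t size;
	size_t skip;
	size_t len;
};

typedef bool (*sketch_consume_fn)(void *cookie, const char *pc);

/* Consumer: applies the skip count chosen by the caller, then stores. */
bool sketch_consume(void *cookie, const char *pc)
{
	struct sketch_cookie *c = cookie;

	if (c->len >= c->size)
		return false;
	if (c->skip > 0) {
		c->skip--;
		return true;
	}
	c->store[c->len++] = pc;
	return c->len < c->size;
}

/*
 * Walker: chain[0] models the walker's own frame, so it starts reporting
 * at chain[1], its parent, and never looks inside the cookie.
 */
void sketch_walk(sketch_consume_fn consume, void *cookie,
		 const char *const *chain, size_t len)
{
	size_t i;

	for (i = 1; i < len; i++)
		if (!consume(cookie, chain[i]))
			break;
}

/* Generic caller: it is the first reported entry, so it skips itself. */
size_t sketch_save(const char **store, size_t size, size_t skipnr,
		   const char *const *chain, size_t len)
{
	struct sketch_cookie c = {
		.store = store,
		.size = size,
		.skip = skipnr + 1,
	};

	sketch_walk(sketch_consume, &c, chain, len);
	return c.len;
}

void sketch_demo(void)
{
	static const char *const chain[] = {
		"walker", "save", "caller", "caller_parent",
	};
	const char *store[4];
	size_t n = sketch_save(store, 4, 0, chain, 4);

	/* With skipnr == 0 the trace begins at the caller of "save". */
	assert(n == 2);
	assert(strcmp(store[0], "caller") == 0);
}
```

With this split, all knowledge of how many entries to hide stays with
whoever set up the cookie, and the walker needs no special case for the
current task.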

Currently, arch_stack_walk() is off-by-one, and we can bodge that by
using __builtin_frame_address(1), though I'm waiting for some compiler
folk to confirm that's sound. Otherwise we need to add an assembly
trampoline to snapshot the FP, which is unfortunately convoluted.

This report suggests that a caller of arch_stack_walk() is off-by-one
too, which suggests a larger cross-architecture semantic issue. I'll try
to take a look tomorrow.

Thanks,
Mark.

>
> Prior to the conversion to ARCH_STACKWALK, we were indeed skipping two
> more entries in __save_stack_trace() if tsk == current. Something like
> below, completely untested:
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index ad20981dfda4..2a9f759aa41a 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -115,10 +115,15 @@ NOKPROBE_SYMBOL(unwind_frame);
>  void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
>  			     bool (*fn)(void *, unsigned long), void *data)
>  {
> +	/* for the current task, we don't want this function nor its caller */
> +	int skip = tsk == current ? 2 : 0;
> +
>  	while (1) {
>  		int ret;
>
> -		if (!fn(data, frame->pc))
> +		if (skip)
> +			skip--;
> +		else if (!fn(data, frame->pc))
>  			break;
>  		ret = unwind_frame(tsk, frame);
>  		if (ret < 0)
>
>
> --
> Catalin