Re: [PATCH 1/2] alpha/ptrace: Record and handle the absence of switch_stack

From: Michael Schmitz
Date: Mon Jun 21 2021 - 01:32:15 EST


Hi Al,

Am 21.06.2021 um 15:44 schrieb Al Viro:
On Mon, Jun 21, 2021 at 03:18:35PM +1200, Michael Schmitz wrote:

This is what I get from WARN_ONCE:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1177 at arch/m68k/kernel/ptrace.c:91 get_reg+0x90/0xb8
Modules linked in:
CPU: 0 PID: 1177 Comm: strace Not tainted 5.13.0-rc1-atari-fpuemu-exitfix+
#1146
Stack from 014b7f04:
014b7f04 00336401 00336401 000278f0 0032c015 0000005b 00000005
0002795a
0032c015 0000005b 0000338c 00000009 00000000 00000000 ffffffe4
00000005
00000003 00000014 00000003 00000014 efc2b90c 0000338c 0032c015
0000005b
00000009 00000000 efc2b908 00912540 efc2b908 000034cc 00912540
00000005
00000000 efc2b908 00000003 00912540 8000110c c010b0a4 efc2b90c
0002d1d8
00912540 00000003 00000014 efc2b908 0000049a 00000014 efc2b908
800acaa8
Call Trace: [<000278f0>] __warn+0x9e/0xb4
[<0002795a>] warn_slowpath_fmt+0x54/0x62
[<0000338c>] get_reg+0x90/0xb8
[<0000338c>] get_reg+0x90/0xb8
[<000034cc>] arch_ptrace+0x7e/0x250
[<0002d1d8>] sys_ptrace+0x232/0x2f8
[<00002ab6>] syscall+0x8/0xc
[<0000c00b>] lower+0x7/0x20

---[ end trace ee4be53b94695793 ]---

Syscall numbers are actually 90 and 192 - sys_old_mmap and sys_mmap2 on
m68k. Used the calculator on my Ubuntu desktop, that appears to be a little
confused about hex to decimal conversions.

I hope that makes more sense?

Not really; what is the condition you are checking? The interesting trace

The check in get_reg() is:


if (WARN_ON_ONCE((off < PT_REG(d1)) &&
test_ti_thread_status(task_thread_info(task),TIS_TRACING)
&& !test_ti_thread_status(task_thread_info(task),
TIS_ALLREGS_SAVED))) {
unsigned long *addr_d0;
addr_d0 = (unsigned long *)(task->thread.esp0 + regoff[16]);
pr_err("get_reg with incomplete stack, regno %d offs %d orig_d0 %lx\n", regno, off, *addr_d0);
return 0;
}


is not that with get_reg() - it's that of the process being traced. You
are not accessing the stack of caller of ptrace(2) here, so you want to
know that SAVE_SWITCH_STACK had been done by the tracee, not tracer.

And if that had been strace ls, you have TIF_SYSCALL_TRACE set for ls, so
* ls hits system_call
* notices TIF_SYSCALL_TRACE and goes to do_trace_entry
* does SAVE_SWITCH_STACK there

... and sets both the new TIS_TRACING and TIS_ALLREGS_SAVED flags in the thread_info->status field (now that I've corrected my patch).

* calls syscall_trace(), which calls ptrace_notify()
* ptrace_notify() calls ptrace_do_notify(), which calls ptrace_stop()
* ptrace_stop() arranges for tracer to be woken up and gives CPU up,
with TASK_TRACED as process state.

Thanks for explaining! So in order to get a trace for the process being traced, I would have to check the TIS_ALLREGS_SAVED in ptrace_stop()?

That's the callchain in ls, and switch_stack accessed by get_reg() from
strace is the one on ls(1) stack created by SAVE_SWITCH_STACK.

So testing for TIS_ALLREGS_SAVED in get_reg() (called by the tracer, but with the tracee's task struct passed to arch_ptrace()) does check that SAVE_SWITCH_STACK was done before the syscall in the tracee, right?

Anyway, I'd missed setting the flags for some crucial SAVE_SWITCH_STACK operations in my woefully incomplete patch. With that corrected, there's no more warning from mmap. I'll try with a more recent version of strace and gdb once I've updated my test image.

Cheers,

Michael