Re: [PATCH v3] ptrace: disable single step in __ptrace_unlink for protecting init task
From: Oleg Nesterov
Date: Wed Oct 26 2022 - 09:57:06 EST
Chen,
I do not know what can I add to our discussion. Let me repeat once again
to summarize:
- I don't think this patch makes a lot of sense. If debugger
exits without PTRACE_DETACH it can crash the tracee in many
ways, even without PTRACE_SINGLESTEP.
- The patch is wrong in any case. If __ptrace_unlink() is called
by the exiting debugger, the child can be running. In this case
we can't use user_disable_single_step(child), it assumes this
child is frozen.
- Even if the change in __ptrace_unlink() was correct, it is racy.
__ptrace_unlink()->user_disable_single_step() can be called after
force_sig_info_to_task() was already called, but before it takes
->siglock and checks ->ptrace. In this case ->ptrace will be zero
and SIGNAL_UNKILLABLE will be cleared.
Oh. but I already said this all, but you seem to ignore. You simply keep
sending the same patch without addressing my objections. So I wrote this
stupid test-case:
#include <stdio.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <assert.h>
int main(void)
{
for (int c = 0;;c++) {
if ((c % 10000) == 0)
printf("%d\n", c);
if (fork())
wait(NULL);
else {
assert(ptrace(PTRACE_ATTACH, 1, NULL,NULL) == 0);
assert(wait(NULL) == 1);
assert(ptrace(PTRACE_SINGLESTEP, 1, NULL, SIGWINCH) == 0);
assert(wait(NULL) == 1);
assert(ptrace(PTRACE_SINGLESTEP, 1, NULL, NULL) == 0);
assert(wait(NULL) == 1);
assert(ptrace(PTRACE_SINGLESTEP, 1, NULL, NULL) == 0);
assert(wait(NULL) == 1);
assert(ptrace(PTRACE_SINGLESTEP, 1, NULL, NULL) == 0);
assert(wait(NULL) == 1);
assert(ptrace(PTRACE_SINGLESTEP, 1, NULL, NULL) == 0);
return 0;
}
}
return 0;
}
and it crashes the kernel WITH YOUR PATCH APPLIED. This proves that
the patch is racy even if the change in __ptrace_unlink() was correct.
Not sure this test-case will work on your machine, I ran it under qemu
with init=/bin/bash.
Chen, I am sorry, but I will not answer if you resend this patch again.
Oleg.
On 10/26, chen zhang wrote:
>
> I got below panic when doing fuzz test:
>
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
> CPU: 2 PID: 1 Comm: systemd Kdump: loaded Not tainted 6.1.0-rc1 #1
> Hardware name: LENOVO 20L5A07XCD/20L5A07XCD, BIOS N24ET56W (1.31 ) 02/19/2020
> Call Trace:
> [ 157.210356] dump_stack_lvl+0x49/0x63
> [ 157.210364] dump_stack+0x10/0x16
> [ 157.210368] panic+0x10c/0x299
> [ 157.210375] do_exit.cold+0x15/0x15
> [ 157.210381] do_group_exit+0x35/0x90
> [ 157.210386] get_signal+0x910/0x960
> [ 157.210390] ? signal_wake_up_state+0x2e/0x40
> [ 157.210396] ? complete_signal+0xd0/0x2c0
> [ 157.210402] arch_do_signal_or_restart+0x37/0x7c0
> [ 157.210408] ? send_signal_locked+0xf5/0x140
> [ 157.210416] exit_to_user_mode_prepare+0x133/0x180
> [ 157.210423] irqentry_exit_to_user_mode+0x9/0x20
> [ 157.210428] noist_exc_debug+0xea/0x150
> [ 157.210433] asm_exc_debug+0x34/0x40
> [ 157.210438] RIP: 0033:0x7fcf2a8e51c9
> [ 157.210442] Code: ff ff 73 01 c3 48 8b 0d c5 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 ba 00 00 00 <0f> 05 c3 0f 1f 40 00 f3 0f 1e fa b8 ea 00 00 00 0f 05 48 3d 01 f0
> [ 157.210446] RSP: 002b:00007ffd7dc44678 EFLAGS: 00000302
> [ 157.210451] RAX: 00000000000000ba RBX: 000055f7c0363170 RCX: 000055f7c04d2820
> [ 157.210454] RDX: 00000000ffffffff RSI: ffffffffffffffff RDI: 000055f7c0363170
> [ 157.210457] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000001dd0
> [ 157.210460] R10: 00007ffd7ddc9090 R11: 000000000000d7da R12: 0000000000000001
> [ 157.210463] R13: ffffffffffffffff R14: 000055f7bf3557c1 R15: 0000000000000000
>
> If a task attaches init task and is single stepping it, when this task
> exits, ptrace value will be cleaned. It causes SIGNAL_UNKILLABLE flag
> cleaned, and init task will lose the protection. Init task maybe be killed
> by SIGTRAP signal because of stepping enabled. So syscall tracing and
> stepping should be turned off for protecting init task before ptrace value
> is cleaned.
>
> Signed-off-by: chen zhang <chenzhang@xxxxxxxxxx>
> ---
> change for v2: remove _TIF_SINGLESTEP because of some architecture has not defined _TIF_SINGLESTEP such as mips.
>
> change for v3:
> Thanks for your reply and your patience. Let us discuss if the patch is useful to avoid kernel panic. In my x86 machine,
> I tested v5.4.18 kernel version, v5.15.67 kernel version and v6.1.0-rc1 kernel version.
> My test C codes include:
> r[3] = 1;
> syscall(__NR_ptrace, 0x4206, r[3], 0, 0);
> syscall(__NR_ptrace, 0x4207, r[3]);
> syscall(__NR_ptrace, 9, r[3], 0, 0);
> I tested while(1) loop and ctrl-c operation, fork and call ptrace syscall, loop 100 times syscalls and so on. The kernel will be in panic.
> For example in 5.4 kernel, trap signal is sent by two pathes:
> the one is:
> force_sig_info_to_task.cold+0x71/0x76
> force_sig_fault+0x63/0x80
> send_sigtrap+0x45/0x50
> do_debug+0x170/0x210
> debug+0x3f/0x60
> debug+0x53/0x60
> the other one is:
> force_sig_info_to_task.cold+0x84/0x8e
> force_sig_fault+0x63/0x80
> user_single_step_report+0x49/0x50
> syscall_slow_exit_work+0x75/0x150
> do_syscall_64+0x156/0x190
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> I debugged the sigtrap sending and get_signal function. When the debugger exits, the first sigtrap
> is prevent by SIGNAL_UNKILLABLE flag in get_signal function. Without my patch, the second sigtrap will be sent,
> and kernel will be in panic.
> I added my patch into the kernel:
> + if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
> + unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
> + user_disable_single_step(child);
> I test the kernel with my patch, the kernel will not be in panic when I run my test program or ctrl-c the test program.
> So I think my patch is efficient to avoid kernel panic. You said:
> - debugger does ptrace(PTRACE_SINGLESTEP), this wakes the tracee up
> - the tracee enters force_sig_info_to_task(SIGTRAP) after single step
> - debugger exits, __ptrace_unlink() clears ptrace/TIF_SINGLESTEP
> - force_sig_info_to_task() clears SIGNAL_UNKILLABLE, the traced init will be killed.
>
> But run kernel with my patch, I can not find SIGTRAP signal without SIGNAL_UNKILLABLE flag for init task in get_signal function,
> so the kernel can not be in panic. Please let me verify my patch, I write a simple test, test.c includes
> r[3] = 1;
> syscall(__NR_ptrace, 0x4206, r[3], 0, 0); /* ptrace(PTRACE_SEIZE, 1, NULL, 0) */
> syscall(__NR_ptrace, 0x4207, r[3]); /* ptrace(PTRACE_INTERRUPT, 1) */
> syscall(__NR_ptrace, 9, r[3], 0, 0); /* ptrace(PTRACE_SINGLESTEP, 1, NULL, 0) */
> syscall(__NR_ptrace, 9, r[3], 0, 0); /* ptrace(PTRACE_SINGLESTEP, 1, NULL, 0) */
> sleep(10000);
> When test is sleeping, I will ctrl-c to stop the test. Do you think the traced init will be killed by SIGTRAP and my patch is unuseful?
> For verify my patch, I add three debug in the kernel of 6.1.0-rc1:
> 1. kernel/ptrace.c
> printk("user_disable_single_step will be called in __ptrace_unlink, child is %s, parent is %s, cpu is %d\n",
> child->comm, child->parent->comm, smp_processor_id());
> 2. kernel/signal.c
> bool get_signal(struct ksignal *ksig) {
> ...
> /* Trace actually delivered signals. */
> trace_signal_deliver(signr, &ksig->info, ka);
> /* chen zhang add */
> if (current->pid == 1 && signr == 5) {
> printk("current->siganl->flags is 0x%x,current->parent is %s, ptrace is %d, cpu is %d in get_signal\n",
> current->signal->flags, current->parent->comm, current->ptrace, smp_processor_id());
> if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
> !sig_kernel_only(signr))
> printk("flages is SIGNAL_UNKILLABLE in get_signal\n");
> else
> printk("SIGNAL_UNKILLABLE is clear in get_signal\n");
>
>
> }
> /* chen zhang add end */
> if (!signr)
> break; /* will return 0 */
> ...
>
> }
> 3. kernel/signal.c
> static int
> force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t,
> enum sig_handler handler) {
> ...
> /* chen zhang add */
> if (t->pid == 1 && sig == 5) {
> printk("init will send sigtrap in force_sig_info_to_task, ptrace is %d, parent is %s,"
> "regs eflags is 0x%lx, thread info flags is 0x%lx, cpu is %d\n",
> t->ptrace,t->parent->comm,task_pt_regs(t)->flags,
> task_thread_info(t)->flags,smp_processor_id());
> }
> /* chen zhang add end */
> if (action->sa.sa_handler == SIG_DFL &&
> (!t->ptrace || (handler == HANDLER_EXIT))) {
> /* chen zhang add */
> printk("SIGNAL_UNKILLABLE will be clear in force_sig_info_to_task, pid is %d, sig is %d\n", t->pid, sig);
> /* chen zhang add end */
> t->signal->flags &= ~SIGNAL_UNKILLABLE;
> }
> ret = send_signal_locked(sig, info, t, PIDTYPE_PID);
> ...
> }
> I installed the debug kernel with my patch and run strace ./test, dmesg:
> [ 768.416269] init will send sigtrap in force_sig_info_to_task, ptrace is 65537, parent is test,regs eflags is 0x306, thread info flags is 0x1000010, cpu is 1
> [ 768.416288] init will send sigtrap in force_sig_info_to_task, ptrace is 65537, parent is test,regs eflags is 0x306, thread info flags is 0x1000010, cpu is 1
> After ctrl-c
> [ 780.963530] user_disable_single_step will be called in __ptrace_unlink, child is systemd, parent is swapper/0, cpu is 5
> [ 780.963580] current->siganl->flags is 0x40,current->parent is swapper/0, ptrace is 0, cpu is 6 in get_signal
> [ 780.963583] flages is SIGNAL_UNKILLABLE in get_signal
>
> Kernel is not in panic. I also use orignal test program, it autogenerated by syzkaller. And it use fork and while(1) loop.
> When it run, the orignal kernel of v6.1.0-rc1 without my patch must be appear panic immediately.
> Now test my kernel, ./err_orig, run two minutes, then ctrl+c stop it, tail of dmesg:
> [ 1993.688471] user_disable_single_step will be called in __ptrace_unlink, child is systemd, parent is swapper/0, cpu is 0
> [ 1993.688514] current->siganl->flags is 0x40,current->parent is swapper/0, ptrace is 0, cpu is 7 in get_signal
> [ 1993.688516] flages is SIGNAL_UNKILLABLE in get_signal
> [ 1993.724130] init will send sigtrap in force_sig_info_to_task, ptrace is 65537, parent is err_orig,regs eflags is 0x306, thread info flags is 0x1000010, cpu is 1
> [ 1993.724299] user_disable_single_step will be called in __ptrace_unlink, child is systemd, parent is swapper/0, cpu is 5
> [ 1993.724367] current->siganl->flags is 0x40,current->parent is swapper/0, ptrace is 0, cpu is 2 in get_signal
> [ 1993.724372] flages is SIGNAL_UNKILLABLE in get_signal
>
> It seemed running well and no kernel panic happened, SIGNAL_UNKILLABLE flag always protected the init task. My patch disabled
> hardware debug on time and avoid many SIGTRIP signals to be created and sent. So my patch is useful and it can avoid kernel
> panic. At last I expect and welcome you review and verify my patch on your x86 machine. You can also give me a test case that
> can cause kernel panic, I will test it on my kernel. Thanks again!
>
> ---
> kernel/ptrace.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 54482193e1ed..e7c41154b31e 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -130,6 +130,9 @@ void __ptrace_unlink(struct task_struct *child)
> put_cred(old_cred);
>
> spin_lock(&child->sighand->siglock);
> + if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
> + unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
> + user_disable_single_step(child);
> child->ptrace = 0;
> /*
> * Clear all pending traps and TRAPPING. TRAPPING should be
> --
> 2.25.1
>