Re: [PATCH] signal: don't always leave task frozen after ptrace_stop()
From: Oleg Nesterov
Date: Tue May 14 2019 - 12:03:46 EST
Roman,
Sorry, I can't agree with this patch. And even the changelog doesn't
look right.
On 05/13, Roman Gushchin wrote:
>
> The ptrace_stop() function contains the cgroup_enter_frozen() call,
> but no cgroup_leave_frozen(). When ptrace_stop() is called from the
> do_jobctl_trap() path, it's correct, because corresponding
> cgroup_leave_frozen() calls in get_signal() will guarantee that
> the task won't leave the signal handler loop frozen.
>
> However, if ptrace_stop() is called from ptrace_signal() or
> ptrace_notify(), there is no such guarantee, and the task may leave
> with the frozen bit set.
ptrace_signal() looks fine in that the task can't return to user-mode:
get_signal() will be called again, exactly because ->frozen == 1 means
TIF_SIGPENDING. So I am not sure I understand why ptrace_signal() does
ptrace_stop(false) with your patch. But this is minor.
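
To illustrate the ptrace_signal() point, here is a toy user-space model, not
kernel code. All toy_* names are hypothetical, and it assumes the cgroup itself
is not being frozen, so leaving the frozen state always succeeds. It only shows
why a set frozen bit forces one more pass through get_signal() before the task
can reach user-mode:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few per-task bits discussed here (hypothetical). */
struct toy_task {
	bool frozen;       /* models task->frozen */
	bool real_signal;  /* models a genuinely pending signal */
	bool sigpending;   /* models TIF_SIGPENDING */
};

/* Models recalc_sigpending_tsk(): the frozen bit alone sets TIF_SIGPENDING. */
static bool toy_recalc_sigpending(struct toy_task *t)
{
	t->sigpending = t->real_signal || t->frozen;
	return t->sigpending;
}

/* Models get_signal(): leaves the frozen state and consumes the signal. */
static void toy_get_signal(struct toy_task *t)
{
	t->frozen = false;       /* models the leave_frozen() call */
	t->real_signal = false;
}

/*
 * Models the return-to-user path: while TIF_SIGPENDING is set the task
 * goes back to get_signal(). Returns how many times get_signal() ran.
 */
static int toy_exit_to_user(struct toy_task *t)
{
	int calls = 0;

	while (toy_recalc_sigpending(t)) {
		toy_get_signal(t);
		calls++;
	}
	/* The task never reaches user-mode with the frozen bit set. */
	assert(!t->frozen);
	return calls;
}
```

In this model a task that leaves ptrace_signal() with frozen == true and no
real signal still takes exactly one extra trip through toy_get_signal().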
> It leads to the regression, reported by Alex Xu. Write system call
> gets mistakenly interrupted by fake TIF_SIGPENDING, which is set
> by recalc_sigpending_tsk() because of the set frozen bit.
IMHO, the real problem is not that the syscall was interrupted. The problem
is that a frozen task must never start a syscall.
---------------------------------------------------------------------------
Can't we add an unconditional leave_frozen() into ptrace_stop() for now?
Sure, this is not what we want. The debugger can disturb CGRP_FROZEN.
But. The "may_remain_frozen" argument uglifies this code too much (imo) and
at the same time it doesn't solve the problem above: CGRP_FROZEN can be cleared
"for no reason".
Say, why should ptrace_event_pid() do leave_frozen(true)? And if there is any
reason, then why can wait_for_vfork_done() do leave_frozen(false)?
Or take the syscall-exit path. It can't miss get_signal(), so it doesn't need
leave_frozen().
In short, I believe that compared to the unconditional leave_frozen() in ptrace_stop()
this patch buys almost nothing, but makes the code and the whole logic much uglier.
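
A rough sketch of what I mean, again as a toy user-space model rather than a
patch. The model_* names and the leave_after_stop switch are hypothetical and
exist only to compare the two behaviours. The assumption is a syscall-entry
stop followed by a write():

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-task state (hypothetical, not task_struct). */
struct model_task {
	bool frozen;      /* models task->frozen */
	bool sigpending;  /* models TIF_SIGPENDING */
};

/* Models recalc_sigpending_tsk() with no real signals pending. */
static void model_recalc_sigpending(struct model_task *t)
{
	t->sigpending = t->frozen;
}

/*
 * Models ptrace_stop() reached via ptrace_notify() on syscall entry.
 * leave_after_stop == true models the unconditional leave_frozen().
 */
static void model_syscall_entry_stop(struct model_task *t, bool leave_after_stop)
{
	t->frozen = true;          /* models cgroup_enter_frozen() */
	/* ... the tracee sleeps here until the debugger resumes it ... */
	if (leave_after_stop)
		t->frozen = false; /* models the unconditional leave_frozen() */
	model_recalc_sigpending(t);
}

/* Models write(): it is interrupted if TIF_SIGPENDING is (falsely) set. */
static bool model_write_completes(const struct model_task *t)
{
	return !t->sigpending;
}
```

Without the leave the task starts the syscall frozen and write() sees the fake
TIF_SIGPENDING. With it, the task is not frozen when the syscall begins.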
Oleg.