Re: [PATCH v2 1/5] sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

From: Eric W. Biederman
Date: Tue Apr 26 2022 - 19:34:27 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> Currently ptrace_stop() / do_signal_stop() rely on the special states
> TASK_TRACED and TASK_STOPPED resp. to keep unique state. That is, this
> state exists only in task->__state and nowhere else.
>
> There's two spots of bother with this:
>
> - PREEMPT_RT has task->saved_state which complicates matters,
> meaning task_is_{traced,stopped}() needs to check an additional
> variable.
>
> - An alternative freezer implementation that itself relies on a
> special TASK state would loose TASK_TRACED/TASK_STOPPED and will
> result in misbehaviour.
>
> As such, add additional state to task->jobctl to track this state
> outside of task->__state.
>
> NOTE: this doesn't actually fix anything yet, just adds extra state.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -770,7 +773,9 @@ void signal_wake_up_state(struct task_st
> * By using wake_up_state, we ensure the process will wake up and
> * handle its death signal.
> */
> - if (!wake_up_state(t, state | TASK_INTERRUPTIBLE))
> + if (wake_up_state(t, state | TASK_INTERRUPTIBLE))
> + t->jobctl &= ~(JOBCTL_STOPPED | JOBCTL_TRACED);
> + else
> kick_process(t);
> }

This hunk is subtle and I don't think it is actually what we want if the
code is going to be robust against tsk->__state becoming TASK_FROZEN.

I think we want the clearing of JOBCTL_STOPPED and JOBCTL_TRACED
to be independent of what tsk->__state and tsk->saved_state are.

Something like:

static inline void signal_wake_up(struct task_struct *t, bool resume)
{
unsigned int state = 0;
if (resume && !(t->jobctl & JOBCTL_DELAY_WAKEKILL)) {
t->jobctl &= ~(JOBCTL_STOPPED | JOBCTL_TRACED);
state = TASK_WAKEKILL;
}
signal_wake_up_state(t, state);
}

static inline void ptrace_signal_wake_up(struct task_struct *t, bool resume)
{
unsigned int state = 0;
if (resume) {
t->jobctl &= ~JOBCTL_TRACED;
state = __TASK_TRACED;
}
signal_wake_up_state(t, state);
}

That would allow __set_task_special in the final patch to look like:

/*
* The special task states (TASK_STOPPED, TASK_TRACED) keep their canonical
* state in p->jobctl. If either of them got a wakeup that was missed because
* TASK_FROZEN, then their canonical state reflects that and the below will
* refuse to restore the special state and instead issue the wakeup.
*/
static int __set_task_special(struct task_struct *p, void *arg)
{
unsigned int state = 0;

if (p->jobctl & JOBCTL_TRACED)
state = TASK_TRACED;

else if (p->jobctl & JOBCTL_STOPPED)
state = TASK_STOPPED;

if (state)
WRITE_ONCE(p->__state, state);

return state;
}


With no need to figure out if a wake_up was dropped and reverse engineer
what the wakeup was.

Eric