Re: [PATCH v2 4/4] freezer,sched: Rewrite core freezer logic
From: Oleg Nesterov
Date: Wed Jul 07 2021 - 10:14:27 EST
Sorry for the delay...
I am still trying to understand this series, just one note for now.
On 06/24, Peter Zijlstra wrote:
>
> +static bool __freeze_task(struct task_struct *p)
> +{
> + unsigned long flags;
> + unsigned int state;
> + bool frozen = false;
> +
> + raw_spin_lock_irqsave(&p->pi_lock, flags);
> + state = READ_ONCE(p->__state);
> + if (state & (TASK_FREEZABLE|__TASK_STOPPED|__TASK_TRACED)) {
> + /*
> + * Only TASK_NORMAL can be augmented with TASK_FREEZABLE,
> + * since they can suffer spurious wakeups.
> + */
> + if (state & TASK_FREEZABLE)
> + WARN_ON_ONCE(!(state & TASK_NORMAL));
> +
> +#ifdef CONFIG_LOCKDEP
> + /*
> + * It's dangerous to freeze with locks held; there be dragons there.
> + */
> + if (!(state & __TASK_FREEZABLE_UNSAFE))
> + WARN_ON_ONCE(debug_locks && p->lockdep_depth);
> +#endif
> +
> + if (state & (__TASK_STOPPED|__TASK_TRACED))
> + WRITE_ONCE(p->__state, TASK_FROZEN|__TASK_FROZEN_SPECIAL);
Well, this doesn't look right.
Firstly, this can race with ptrace_freeze_traced(), which can set
p->__state = __TASK_TRACED and thereby clear TASK_FROZEN, or with
__set_current_state(TASK_RUNNING) in ptrace_stop().
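
To make the race concrete, here is a small userspace model of one bad
interleaving. It is only a sketch: the M_* bit values, struct model_task
and the model_*() helpers are made up for illustration and are not kernel
code, and it simply assumes that the other writer's store is not
serialized against the freezer's store.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not copied from the series. */
#define M_TASK_RUNNING        0x0000u
#define M_TASK_TRACED         0x0008u
#define M_TASK_FROZEN         0x8000u
#define M_TASK_FROZEN_SPECIAL 0x10000u

/* Hypothetical stand-in for the __state word of a task. */
struct model_task { unsigned int state; };

/* Freezer side: like the quoted branch, overwrite the whole state word. */
static void model_freezer_write(struct model_task *p)
{
	if (p->state & M_TASK_TRACED)
		p->state = M_TASK_FROZEN | M_TASK_FROZEN_SPECIAL;
}

/* Tracer side: stands in for ptrace_freeze_traced() storing __TASK_TRACED. */
static void model_tracer_write(struct model_task *p)
{
	p->state = M_TASK_TRACED;
}

/* Tracee side: stands in for __set_current_state(TASK_RUNNING). */
static void model_tracee_write(struct model_task *p)
{
	p->state = M_TASK_RUNNING;
}

/*
 * One interleaving: the freezer's store lands first, then the other
 * writer's plain store. Returns true if the frozen bit survived.
 */
static bool model_frozen_survives(void (*other_writer)(struct model_task *))
{
	struct model_task t = { .state = M_TASK_TRACED };

	model_freezer_write(&t);
	assert(t.state & M_TASK_FROZEN);	/* freezer now believes p is frozen */
	other_writer(&t);			/* late store from the other side */
	return (t.state & M_TASK_FROZEN) != 0;
}
```

In this model, model_frozen_survives() returns false for both
model_tracer_write and model_tracee_write: the later plain store wipes the
frozen bit that the freezer has just set.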
But the main problem is that you can't simply remove __TASK_TRACED:
this can confuse the debugger, since any ptrace() request will fail as
if the tracee had been killed.
Another problem. Suppose that p->parent sleeps in do_wait(). p calls
ptrace_stop(), sets __TASK_TRACED, and wakes the parent up.
__freeze_task() clears __TASK_TRACED.
The parent calls wait_task_stopped(p), but it fails because
task_is_traced() returns false. The parent goes back to sleep, this time
forever, because __thaw_special() won't notify it.
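
The do_wait() scenario above can be modeled in a few lines of userspace C.
Again, this is a sketch under stated assumptions: the M_* values,
struct model_task and the model_*() helpers are hypothetical names, and
the only thing modeled is the state word plus the "is the traced bit set"
test that the parent relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not copied from the series. */
#define M_TASK_STOPPED        0x0004u
#define M_TASK_TRACED         0x0008u
#define M_TASK_FROZEN         0x8000u
#define M_TASK_FROZEN_SPECIAL 0x10000u

/* Hypothetical stand-in for the __state word of a task. */
struct model_task { unsigned int state; };

/* Models the task_is_traced() test: is the traced bit set? */
static bool model_is_traced(const struct model_task *p)
{
	return (p->state & M_TASK_TRACED) != 0;
}

/* Models the quoted branch of __freeze_task(): overwrite, not OR. */
static bool model_freeze(struct model_task *p)
{
	if (p->state & (M_TASK_STOPPED | M_TASK_TRACED)) {
		p->state = M_TASK_FROZEN | M_TASK_FROZEN_SPECIAL;
		return true;
	}
	return false;
}

/*
 * Walks through the scenario. Returns true if the parent, woken by the
 * child's stop, finds nothing to report once the freezer has run.
 */
static bool model_parent_misses_stop(void)
{
	struct model_task child = { .state = M_TASK_TRACED }; /* after ptrace_stop() */

	assert(model_is_traced(&child));	/* a wait at this point would succeed */
	model_freeze(&child);			/* freezer runs before the parent looks */
	return !model_is_traced(&child);	/* the parent's check now fails */
}
```

Here model_parent_misses_stop() returns true. In the real code the parent
would then go back to sleep, and per the above nothing on the thaw side
wakes it up again.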
Or suppose that __freeze_task() removes __TASK_STOPPED. A new debugger
attaches, and the tracee should switch from STOPPED to TRACED. But
this won't happen because task_is_stopped() in ptrace_() will return
false and task_set_jobctl_pending/signal_wake_up_state won't be called.
Oleg.