Re: [PATCH] freezer,sched: Rewrite core freezer logic

From: Oleg Nesterov
Date: Mon Jun 14 2021 - 12:54:36 EST


On 06/14, Peter Zijlstra wrote:
>
> On Mon, Jun 14, 2021 at 05:42:47PM +0200, Oleg Nesterov wrote:
> >
> > > + /*
> > > + * If stuck in TRACED, and the ptracer is FROZEN, we're frozen too.
> > > + */
> > > + if (task_is_traced(p))
> > > + return frozen(rcu_dereference(p->parent));
> >
> > This looks racy, p->parent can resume this task and then enter
> > __refrigerator().
>
> But this is about the child, we won't report it frozen, unless the
> parent is also frozen. If the parent is frozen, it cannot resume the
> task.

Yes, but...

> The other way around, if the parent resumes the task and then gets
> frozen,

Yes ...

> then we'll wait until the task gets frozen.

how/where will we wait until the tracee gets frozen ?

Again, suppose that p->parent resumes p and gets frozen after the
task_is_traced(p) check and before the frozen(p->parent) check.

Then try_to_freeze_tasks() can succeed with todo == 0 and miss the
running "p" ?

> > > + * If stuck in STOPPED and the parent is FROZEN, we're frozen too.
> > > + */
> > > + if (task_is_stopped(p))
> > > + return frozen(rcu_dereference(p->real_parent));
> >
> > (you could use ->parent in this case too and unify this check with the
> > "traced" case above)
>
> Are you sure? The way I read the code ptrace_attach() will change
> ->parent, but STOPPED is controlled by the jobctl.

Yes, sorry I was not clear. let me add more details.

task_is_stopped() is only possible if task is not ptraced, see the
"if (!current->ptrace)" check before set_special_state(TASK_STOPPED)
in do_signal_stop(). And if the task is not traced, then
task->parent == task->real_parent.

> > I don't understand. How this connects to ->parent or ->real_parent?
> > SIGCONT can come from anywhere and wake this stopped task up?
>
> Could be me who's not understanding, I thought only the real parent
> could do that.

No, any task can do this, as long as check_kill_permission() succeeds.
Even the kernel can send SIGCONT, say, you can use F_SETSIG(SIGCONT).

> > I guess you do this to avoid freezable_schedule() in ptrace/signal_stop,
> > and we can't use TASK_STOPPED|TASK_FREEZABLE, it should not run after
> > thaw()... But see above, we can't rely on __frozen(parent).
>
> I do this because freezing puts a task in TASK_FROZEN, and that cannot
> preserve TAKS_STOPPED or TASK_TRACED without being subject to wakups

Yes, yes, this is what I tried to say.

Oleg.