Re: [RFC][PATCH 0/5] Signal scalability series

From: Matt Fleming
Date: Tue Oct 04 2011 - 04:56:32 EST


On Mon, 2011-10-03 at 15:16 +0200, Oleg Nesterov wrote:
> Why do we have? Usually SIGCONT is ignored. But this doesn't matter,
> SIGCONT acts at the sending time.
>
> If SIGCONT is sent - the process must not stop. Since we drop the lock
> we can't guarantee this.

OK, I see, thanks.

> > > Maybe do_signal_stop() does something special? At first glance it doesn't.
> > > But wait, it does while_each_thread() under ->ctrl_lock, why is this safe?
> >
> > Why is it not safe? What scenario are you thinking of where that isn't
> > safe?
>
> This series doesn't add ->ctrl_lock into copy_process/__unhash_process
> or I misread the patches. This means we can't trust the ->thread_group list.

*facepalm*

Arrrrggghh! This is why I complain about sighand->siglock protecting too
much: I didn't even _REALISE_ it protected the ->thread_group list.
Thanks for pointing that out, Oleg!
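
To make the problem concrete, here is a minimal userspace sketch, not
kernel code. The struct and function names are hypothetical stand-ins,
and it assumes default (non-recursive) pthread mutexes. It only shows
that holding one lock does nothing to exclude a writer that takes a
different lock, which is the situation when the walker holds ->ctrl_lock
while copy_process/__unhash_process modify the list under ->siglock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two locks discussed in this thread. */
struct model_sighand {
	pthread_mutex_t siglock;	/* taken by the thread-list writers */
	pthread_mutex_t ctrl_lock;	/* taken by the thread-list walker */
};

/*
 * Returns true if a writer that takes writer_lock could still run while
 * a walker is holding reader_lock, i.e. the walk is NOT protected.
 */
static bool writer_can_run_during_walk(pthread_mutex_t *reader_lock,
				       pthread_mutex_t *writer_lock)
{
	bool can_run = false;

	pthread_mutex_lock(reader_lock);	/* walker enters its loop */

	/* trylock never blocks; it returns 0 only if the lock was free */
	if (pthread_mutex_trylock(writer_lock) == 0) {
		can_run = true;			/* writer got in mid-walk */
		pthread_mutex_unlock(writer_lock);
	}

	pthread_mutex_unlock(reader_lock);
	return can_run;
}
```

With the walker on ctrl_lock and the writer on siglock this returns
true; with both sides on the same lock it returns false.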

> Even if this is safe (say, we can rely on rcu), we can't calculate
> ->group_stop_count correctly. In particular, without ->siglock we can
> race with exit_signals() which sets PF_EXITING. Note that PF_EXITING
> check in task_set_jobctl_pending() is important.

Ah, I think it was these lines that confused me into thinking
->ctrl_lock wasn't required around PF_EXITING:

	void exit_signals(struct task_struct *tsk)
	{
		int group_stop = 0;
		sigset_t unblocked;

		if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
			tsk->flags |= PF_EXITING;
			return;
		}

But I guess that's safe because either we're the only thread in the
group or the group is already going to exit?
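
For the general (non-fast-path) case, here is a rough single-threaded
model of why the PF_EXITING update and the ->group_stop_count update
have to happen under one lock. Everything prefixed model_/MODEL_ is
made up for illustration; it only mimics the idea that the stop
initiator skips exiting tasks when counting, and that an exiting task
which was already counted has to take itself back out of the count.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PF_EXITING	0x1u	/* stand-in for PF_EXITING */
#define MODEL_STOP_PENDING	0x2u	/* stand-in for a pending group stop */

struct model_task {
	unsigned int flags;
};

struct model_signal {
	struct model_task *threads;
	size_t nr_threads;
	int group_stop_count;
};

/* Like the PF_EXITING check Oleg mentions: refuse exiting tasks. */
static bool model_set_stop_pending(struct model_task *t)
{
	if (t->flags & MODEL_PF_EXITING)
		return false;
	t->flags |= MODEL_STOP_PENDING;
	return true;
}

/* Stop initiator: count every thread that still has to report in. */
static void model_start_group_stop(struct model_signal *sig)
{
	sig->group_stop_count = 0;
	for (size_t i = 0; i < sig->nr_threads; i++)
		if (model_set_stop_pending(&sig->threads[i]))
			sig->group_stop_count++;
}

/* Exit serialised with the count: set the flag AND leave the stop. */
static void model_exit_locked(struct model_signal *sig, struct model_task *t)
{
	t->flags |= MODEL_PF_EXITING;
	if (t->flags & MODEL_STOP_PENDING) {
		t->flags &= ~MODEL_STOP_PENDING;
		sig->group_stop_count--;
	}
}

/* Exit that only sets the flag, with no counter fix-up. */
static void model_exit_flag_only(struct model_task *t)
{
	t->flags |= MODEL_PF_EXITING;
}

/* Threads that will really stop: pending and not exiting. */
static int model_threads_that_will_stop(const struct model_signal *sig)
{
	int n = 0;

	for (size_t i = 0; i < sig->nr_threads; i++) {
		unsigned int f = sig->threads[i].flags;

		if ((f & MODEL_STOP_PENDING) && !(f & MODEL_PF_EXITING))
			n++;
	}
	return n;
}
```

In this model, a flag-only exit after the count was taken leaves
group_stop_count one higher than the number of threads that will ever
stop, so the group stop would never complete. Whether the early return
above can hit that case is exactly my question.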

--
Matt Fleming, Intel Open Source Technology Center
