Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
From: Oleg Nesterov
Date: Wed Aug 17 2005 - 09:24:31 EST
Paul E. McKenney wrote:
>
> > The other thing that jumped out at me is that signals are very different
> > animals from a locking viewpoint depending on whether they are:
> >
> > 1. ignored,
> >
> > 2. caught by a single thread,
> >
> > 3. fatal to multiple threads/processes (though I don't know
> > of anything that shares sighand_struct between separate
> > processes), or
> >
> > 4. otherwise global to multiple threads/processes (such as
> > SIGSTOP and SIGCONT).
> >
> > And there are probably other distinctions that I have not yet caught
> > on to.
> >
> > One way to approach this would be to make your suggested lock_task_sighand()
> > look at the signal and acquire the appropriate locks. If, having acquired
> > a given set of locks, it found that the needed set had changed (e.g., due
> > to racing exec() or sigaction()), then it drops the locks and retries.
>
> OK, for this sort of approach to work, lock_task_sighand() must take and
> return some sort of mask indicating what locks are held. The mask returned
> by lock_task_sighand() would then be passed to an unlock_task_sighand().
Sorry, I don't understand you. CLONE_THREAD implies CLONE_SIGHAND,
so we always need to lock one ->sighand. Could you please clarify?
> > > + if (!ret && sig && (sp = p->sighand)) {
> > > if (!get_task_struct_rcu(p)) {
> > > return -ESRCH;
> > > }
> > > - spin_lock_irqsave(&p->sighand->siglock, flags);
> > > + spin_lock_irqsave(&sp->siglock, flags);
> > > + if (p->sighand != sp) {
> > > + spin_unlock_irqrestore(&sp->siglock, flags);
> > > + put_task_struct(p);
> > > + goto retry;
> > > + }
> > > ret = __group_send_sig_info(sig, info, p);
> > > - spin_unlock_irqrestore(&p->sighand->siglock, flags);
> > > + spin_unlock_irqrestore(&sp->siglock, flags);
> > > put_task_struct(p);
> >
> > Do we really need get_task_struct_rcu/put_task_struct here?
> >
> > The task_struct can't go away under us, it is rcu protected.
> > When ->sighand is locked, and it is still the same after
> > the re-check, it means that 'p' has not done __exit_signal()
> > yet, so it is safe to send the signal.
> >
> > And if the task has ->usage == 0, it means that it also has
> > ->sighand == NULL, and your code will notice that.
> >
> > No?
>
> Seems plausible. I got paranoid after seeing the lock dropped in
> handle_stop_signal(), though.
Yes, this is bad and should be fixed, I agree.
But why do you think we need to bump task->usage? It can't make any
difference, afaics. The task_struct can't dissapear, the caller was
converted to use rcu_read_lock() or it holds tasklist_lock.
Nonzero task_struct->usage can't stop do_exit or sys_wait4, it will
only postpone call_rcu(__put_task_struct_cb).
And after we locked ->sighand we have sufficient memory barrier, so
if we read the stale value into 'sp' we will notice that (if you were
worried about this issue).
Am I missed something?
> void exit_sighand(struct task_struct *tsk)
> {
> write_lock_irq(&tasklist_lock);
> - __exit_sighand(tsk);
> + spin_lock(&tsk->sighand->siglock);
> + if (tsk->sighand != NULL) {
> + __exit_sighand(tsk);
> + }
> + spin_unlock(&tsk->sighand->siglock);
> write_unlock_irq(&tasklist_lock);
> }
Very strange code. Why this check? And what happens with
spin_unlock(&tsk->sighand->siglock);
when tsk->sighand == NULL ?
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/