Re: [PATCH 2/2] posix-cpu-timers: use ->sighand instead of->signal to check the task is alive

From: Oleg Nesterov
Date: Thu Feb 05 2009 - 10:58:31 EST


On 02/04, Roland McGrath wrote:
>
> > It doesn't matter which pointer to check under tasklist to ensure the task
> > was not released, ->signal or ->sighand. But we are going to make ->signal
> > refcountable, change the code to use ->sighand.
>
> I haven't been following what that's about (signal_struct already has two
> atomic counts!).

We can't use them as refcounts. You can't bump ->live or ->count without
breaking group_dead or exec logic. Perhaps we can use ->count, but then
we need other changes. But this has nothing to do with this patch.

The goal is to keep task->signal after release_task(), it will be freed
by __put_task_struct(). This allows a lot of simplifications and we can
move some fields from task_struct to signal_struct.

But first we should change the code which does

lock(tasklist);
if (task->signal != NULL)
/* Great! the task was not released */
do_something(task);

This patch does not change the current behaviour, but makes possible
to change the "scope" of ->signal without breaking the current code.

> Uses here protecting cpu_clock_sample_group() e.g., are
> around looking at ->signal->foobar, so if ->signal is still there, why not
> look at it and be able to get the sample in whatever small window this is?

What if arm_timer() sees ->signal != NULL, proceeds, and attaches the
timer to the signal_struct of the already dead task? This signal_strcut
will be released with the pending timer.

Even cpu_clock_sample_group() is not safe, unless we add other changes.

> I don't really understand what this new case might mean though. Most
> things that look at ->signal need to lock it, so access doesn't make any
> sense if there is no siglock because ->sighand is clear while ->signal is not.

For example. __sched_setscheduler() needs to read a single word,
signal->rlim[RLIMIT_RTPRIO].rlim_cur, but it has to lock ->siglock
to access ->signal.

Worse. Please look at __exit_signal()->task_rq_unlock_wait(). This hack
is absolutely stupid and mmust die.


But in any case. Even if we don't need the further changes, do you
agree this patch is correct and doesn't change the behaviour?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/