Re: posix-cpu-timers revamp

From: Roland McGrath
Date: Fri Apr 04 2008 - 19:18:21 EST


> BTW I did look at allocating it in posix_cpu_timer_set() and
> set_process_cpu_timer() but the first at least is doing stuff with locks
> held. I'll keep looking at it, though.

Yeah, it's a little sticky. It would be simple enough to arrange things to
do the allocation when needed before taking locks, but would not make the
code wonderfully self-contained. I wouldn't worry about it unless/until we
conclude for other reasons that this really is the best way to go. We seem
now to be leaning towards allocating at clone time anyway.

> One little gotcha we just ran into, though: When checking
> tsk->signal->(anything) in run_posix_cpu_timers(), we have to hold
> tasklist_lock to avoid a race with release_task(). This is going to
> make even the null case always cost more than before.

This is reminiscent of something that came up once before. I think it
was the same issue of what happens on a tick while the thread is in the
middle of exiting and racing with release_task. See commit
72ab373a5688a78cbdaf3bf96012e597d5399bb7, commit
3de463c7d9d58f8cf3395268230cb20a4c15bffa and related history (some of
the further history is pre-GIT).

> For the local environment, I solved the problem by moving the percpu
> structure out of the signal structure entirely and by making it
> refcounted.

This is a big can of worms that we really don't need. Complicating the
data structure handling this way is really not warranted at all just to
address this race. You'll just create another version of the same race
with a different pointer, and then solve it some simple way that you
could have just used to solve the existing problem. If you don't have
some independent (and very compelling) reasons to reorganize the data
structures, nix nix nix.

We can make posix_cpu_timers_exit() or earlier in the exit/reap path
tweak any state we need to ensure that this problem won't come up.
But, off hand, I don't think we need any new state.

Probably the right fix is to make the fast-path check do:

rcu_read_lock();
signal = rcu_dereference(current->signal);
if (unlikely(!signal) || !fastpath_process_timer_check(signal)) {
rcu_read_unlock();
return;
}
sighand = lock_task_sighand(current, &flags);
rcu_read_unlock();
if (likely(sighand))
slowpath_process_timer_work(signal);
unlock_task_sighand(sighand, &flags);

Another approach that is probably fine too is just to do:

if (unlikely(current->exit_state))
return;

We can never get to the __exit_signal code that causes the race if we
are not already late in exit. The former seems a little preferable
because the added fast-path cost is the same (or perhaps even more
cache-friendly), but it fires timers even at the very last tick during
exit, and only loses the ideal behavior (of always firing if the timer
expiration is ever crossed) at the last possible instant.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/