Re: posix-cpu-timers revamp

From: Roland McGrath
Date: Tue Mar 11 2008 - 17:35:24 EST


> > Not quite. check_process_timers only needs to be run once per thread
> > group (per interesting tick).
>
> Where "interesting tick" means "tick in which a process timer has
> expired," correct?

Or might have expired, in the current implementation style. Correct.

> > The process CPU timers track the process CPU clocks. [...]
>
> And my changes introduce these clocks as separate fields in the signal
> struct, updated at the tick.

Correct.

> Okay, I hadn't been clear on the distinction between process-wide and
> thread-only timers. So, really, run_posix_cpu_timers() needs to check
> both sets, the versions in the signal struct for the process-wide timers
> and the versions in the task struct for the thread-only timers.

Correct.

> I'm going to table this for now. [...]

Agreed.

> So, check_process_timers() checks for and handles any expired timers for
> the currently-running process, whereas check_thread_timers() checks for
> and handles any expired timers for the currently-running thread. Is
> that correct?

Correct.

> And, since these timers are only counting CPU time, if a thread is never
> running at the tick (since that's how we account time in the first
> place) any timers it might have will never expire.

Correct.

> At each tick a process-wide timer may have expired. Also, at each tick
> a thread-only timer may have expired. Or, of course, both. So we need
> to detect both events and fire the appropriate timer in the appropriate
> context.

Correct.

> [...] I'm pretty confident that it was
> cache conflict among the sixteen cores that did the damage.

I'm not surprised by this result. (I do want to see much more detailed
performance analysis before we decide on a final change.)

> I'm currently working on an implementation that uses the alloc_percpu()
> mechanism and a separate structure. I'm encapsulating access to the
> fields in shared_xxx_sum() inline functions, which could have different
> implementations for UP, dual-CPU and generic SMP kernels.

That is exactly what I had in mind. (I hadn't noticed alloc_percpu, and it
has one more level of indirection than I'd planned. But that wastes less
space when num_possible_cpus() is far greater than num_online_cpus(), and I
imagine it's vastly superior for NUMA.)

Don't forget do_[gs]etitimer and k_getrusage can use this too.
(Though maybe no reason to bother in k_getrusage since it has
to loop to sum the non-time counters anyway.)

> I personally think that the most promising approach is the one outlined
> above (without considering the context-switch scheme for the moment).

I tend to agree. It's the only plan I've thought through in detail.
But my remarks stand, about thorough analysis of performance impacts
of options we can think of.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/