Re: [PATCH] Migration of Standard Timers

From: Andrew Morton
Date: Fri Sep 22 2006 - 15:17:27 EST


On Tue, 19 Sep 2006 10:29:42 -0500
Dimitri Sivanich <sivanich@xxxxxxx> wrote:

> This patch allows the user to migrate currently queued
> standard timers from one cpu to another. Migrating
> timers off of select cpus allows those cpus to run
> time critical threads with minimal timer induced latency
> (which can reach 100's of usec for a single timer as shown
> on an X86_64 test machine), thereby improving overall
> determinance on those selected cpus.
>
> This patch considers timers placed with add_timer_on()
> to have 'cpu affinity' and does not move them, unless the
> timers are being migrated off of a hotplug cpu that is
> going down.
>
> The changes in drivers/base/cpu.c provide a clean and
> convenient interface for triggering the migration through
> sysfs, via writing the destination cpu number to an owner
> writeable file (0200 permissions) associated with the source
> cpu. In additon, this functionality is available for kernel
> module use.
>
> Note that migrating timers will not, by itself, keep new
> timers off of the chosen cpu. But with careful control of
> thread affinity, one can control the affinity of new timers
> and keep timer induced latencies off of the chosen cpu.
>
> This particular patch does not affect the hrtimers. That
> could be addressed later.

I can't say I like this, sorry.

- It adds another word to the timer_list structure for a very obscure
application. And a lot of kernel data structures aggregate timer_lists.

- There are places in the kernel which assume that once a timer is added
on a CPU, it will stay there. The timer handler re-arms the timer,
confident in the knowledge that everything stays on this CPU.

I recall working on such code a couple of years ago, but I now forget where
it was.

It is reasonable to do add_timer() within the CPU_ONLINE handler (for
example), in the expectation that that timer will fire on this CPU.

The proposed change permits the administrator to break that assumption.
We would need to hunt down any code which makes that assumption and
convert it to add_timer_on(). And we'd need to be very vigilant in the
future, since people could easily add new code which had the old
assumption which worked just fine for them in testing.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/