Re: [RFC PATCH 0/4] timers: framework for migration between CPU

From: Ingo Molnar
Date: Mon Feb 23 2009 - 05:23:18 EST



* Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:

> * Ingo Molnar <mingo@xxxxxxx> [2009-02-23 10:11:58]:
>
> >
> > * Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > * Ingo Molnar <mingo@xxxxxxx> [2009-02-20 22:53:18]:
> > >
> > > >
> > > > * Arjan van de Ven <arjan@xxxxxxxxxxxxx> wrote:
> > > >
> > > > > On Fri, 20 Feb 2009 17:07:37 +0100
> > > > > Ingo Molnar <mingo@xxxxxxx> wrote:
> > > > >
> > > > > >
> > > > > > * Vaidyanathan Srinivasan <svaidy@xxxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > > I'd also suggest to not do that rather ugly
> > > > > > > > enable_timer_migration per-cpu variable, but simply reuse
> > > > > > > > the existing nohz.load_balancer as a target CPU.
> > > > > > >
> > > > > > > This is a good idea to automatically bias the timers. But
> > > > > > > this nohz.load_balancer is a very fast moving target and we
> > > > > > > will need some heuristics to estimate overall system idleness
> > > > > > > before moving the timers.
> > > > > > >
> > > > > > > I would agree that the power saving load balancer has a good
> > > > > > > view of the system and can potentially guide the timer biasing
> > > > > > > framework.
> > > > > >
> > > > > > Yeah, it's a fast moving target, but it already concentrates
> > > > > > the load somewhat.
> > > > > >
> > > > >
> > > > > I wonder if the real answer for this isn't to have timers be
> > > > > considered schedulable-entities and have the regular scheduler
> > > > > decide where they actually run.
> > > >
> > > > hm, not sure - it's a bit heavy for that.
> > > >
> > >
> > > I think the basic timer migration policy should exist in user
> > > space.
> >
> > I disagree.
> >
>
> See below
>
> > > One of the ways of looking at it is, as we begin to
> > > consolidate, using range timers and migrating all timers to
> > > a smaller number of CPUs would make a whole lot of sense.
> > >
> > > As far as the scheduler making those decisions is concerned,
> > > my concern is that the load balancing is a continuous process
> > > and timers don't necessarily work that way. I'd put my neck
> > > out and say that irqbalance, range timers and timer migration
> > > should all belong to user space. irqbalance and range timers
> > > do, so should timer migration.
> >
> > As I said in my first reply, IRQ migration is special because
> > IRQs are not kernel-internal objects; they come externally, so
> > there's a lot of user-space enumeration, policy and other steps
> > involved. Furthermore, IRQs are migrated in a 'slow' fashion.
> >
> > Timers on the other hand are fast entities tied to _tasks_
> > primarily, not external entities.
>
> Timers are also queued due to external events like interrupts
> (device drivers tend to set off timers all the time). [...]

That is a silly argument. Tasks are created due to 'external
events' as well, such as the user hitting a key.

What matters, and what my argument was, is whether the kernel
_generates_ the event. For most IRQ events it does not; for the
overwhelming majority of timer events it consciously generates
them. That makes the two very different.

> [...] I am not fully against what you've said, at some
> semantic level what you are suggesting is that at a higher
> level of power saving, when the scheduler balances timers it
> is doing a form of soft CPU hotplug on the system by migrating
> timers and tasks away from idle CPUs when the load can be
> handled by other CPUs. See below as well.
>
> > Hence they should migrate
> > according to the CPU where the activities of the system
> > concentrates - i.e. where tasks are running.
> >
> > Another thing: do you argue for the existing timer-migration
> > code we have in mod_timer() to move to user-space too? It isn't a
> > consistent argument to push 'some' of it to user-space and keep
> > some of it in kernel-space.
> >
>
> No. mod_timer() is correct where it is.

You did not reply to my statement that the argument is a double
standard. Why do certain migrations belong in the kernel and
others not?

> Consider the powertop usage scenario today
>
> 1. Powertop displays a list of timers and common causes of wakeup
> 2. It recommends policies in user space that can affect power savings
> a. usb autosuspend
> b. wireless link management
> c. disable HAL polling

That's different - those are PowerTop timer event _reduction_
policies, not migration policies for existing timers.

> My argument is, why can't we add
>
> d. Use range timers
> e. Consolidate timers
>
> In the future.
>
> Even sched_mc=n is set by user space, so really the
> policy is in user space.

That is different again. sched_mc is a broad switch, not a
dynamic control like the sysfs migration interface introduced
in this patchset, which is what we are discussing.

Ingo