Re: RFC: mixing device idle and CPUidle or non-atomic idle notifiers

From: Rafael J. Wysocki
Date: Tue Sep 28 2010 - 18:25:16 EST


On Saturday, September 25, 2010, Kevin Hilman wrote:
> Now that we have runtime PM for devices, I'm exploring ways of how to
> couple the runtime PM of certain devices with CPUidle transitions.
> Ideally, CPUidle should only manage CPU idle states, and device idle
> states would be managed separately using runtime PM. However, there are
> cases where the device idle transitions need to be coordinated with CPU
> idle transitions. This is already a proposed topic for the PM
> mini-conf at Plumbers'[1], so this RFC is to get the discussion started.

OK

> In the wild west (before runtime PM), we managed these special cases on
> OMAP by having some special hacks^Whooks for certain drivers that were
> called during idle. When these devices are converted to using runtime
> PM, ideally we'd like to initiate device runtime PM transitions for these
> devices somehow coordinated with CPU idle transitions.
>
> So, I started to explore how to coordinate device runtime PM transitions
> with CPU idle transitions.
>
> One of the fundamental problems is that by the time CPUidle is entered,
> interrupts are already disabled, and runtime PM cannot be used from
> interrupts disabled context (c.f. thread on linux-pm[1].)

This issue should be addressed by Alan, who is adding a new flag to struct
dev_pm_info that will tell the runtime PM framework to work under the
assumption that interrupts are off.
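
To make the idea concrete, here is a rough user-space model of how such a
flag could gate a runtime suspend request. It is only a sketch: the struct,
the flag name (irqs_off_ok) and the function are hypothetical and are not the
actual dev_pm_info layout or any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-device PM data; NOT struct dev_pm_info. */
struct model_pm_info {
	bool irqs_off_ok;	/* assumed flag: callbacks never sleep */
	bool suspended;
};

enum { MODEL_OK = 0, MODEL_EAGAIN = -1 };

/*
 * Model of a suspend request. If the caller runs with interrupts
 * disabled, the request is only honoured for devices that declared
 * their callbacks safe for that context.
 */
static int model_runtime_suspend(struct model_pm_info *pm, bool irqs_disabled)
{
	assert(pm != NULL);

	if (irqs_disabled && !pm->irqs_off_ok)
		return MODEL_EAGAIN;	/* might sleep: refuse in atomic context */

	pm->suspended = true;
	return MODEL_OK;
}
```

In this model a device without the flag can still be suspended from process
context, while a request made with interrupts off is rejected rather than
risking a sleep.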

> So that led me down the path of exploring whether we really need to have
> interrupts disabled during the early part of CPUidle. It seems to me
> that during the time when the governor is selecting a state, and when
> the platform-specific code is checking for device/bus activity,
> interrupts do not really need to be disabled yet. At least, I didn't
> come up with a good reason why they need to be disabled so early, hence
> the RFC.
>
> Here's a simplified version of how it works today:
>
> /* arch/arm/kernel/process.c, arch/x86/kernel/process_*.c */
> cpu_idle()
>     local_irq_disable()
>     pm_idle() --> cpuidle_idle_call()
>
> cpuidle_idle_call()
>     dev->prepare()
>     target_state = governor->select() /* selects next state */
>     target_state->enter()
>     /* the ->enter hook must enable IRQs before returning */
>
> As a quick hack, I just (re)enabled interrupts in our CPUidle
> ->prepare() hook (they're later disabled again before the core idle is
> run.) This allowed the calling of device-specific idle functions which
> then use runtime PM and thus allows device-specific idle to be
> coordinated with the CPU idle.
>
> So back to the main question... do we really need interrupts disabled so
> early in the idle path?
>
> I'm sure I'm missing something obvious about why this can't work, but
> it's Friday and my brain prefers to think about beer rather than
> CPUidle.
>
> Or, as another potential option...
>
> I just discovered that x86_64 has an atomic idle_notifier called just
> before idle (c.f. arch/x86/kernel/process_64.c.) However this is also
> done with interrupts disabled, so using this has the same problems with
> interrupts disabled. But, what about adding an additional notifier
> chain that happens with interrupts still enabled.... hmm, will
> ponder that over that beer...

I must admit I haven't looked very deeply into the cpuidle code, but
certainly there are good reasons to make it collaborate with the I/O runtime PM.

It would be good to know whether we can relax the handling of interrupts in
the cpuidle framework a bit, one way or another.
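
For the sake of discussion, the relaxed ordering Kevin asks about could be
modelled roughly as below. This is a user-space sketch, not kernel code: the
interrupt state is just a bool, and every name in it (the pre-idle chain,
register_pre_idle(), idle_once(), and so on) is made up for illustration.
It only shows the ordering: device idle hooks and state selection run with
interrupts enabled, and interrupts are disabled just before the state is
entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are real kernel interfaces. */
static bool irqs_enabled = true;	/* stand-in for the CPU IRQ state */

#define MAX_NOTIFIERS 4
typedef bool (*pre_idle_fn)(void);	/* returns true if the device went idle */

static pre_idle_fn pre_idle_chain[MAX_NOTIFIERS];
static size_t pre_idle_count;

static int register_pre_idle(pre_idle_fn fn)
{
	if (pre_idle_count >= MAX_NOTIFIERS)
		return -1;
	pre_idle_chain[pre_idle_count++] = fn;
	return 0;
}

static int sleepable_calls;	/* hook invocations made with IRQs on */
static int atomic_violations;	/* hook invocations made with IRQs off */

/* Stand-in for a device idle hook that may only run with IRQs enabled. */
static bool demo_device_idle(void)
{
	if (!irqs_enabled) {
		atomic_violations++;
		return false;
	}
	sleepable_calls++;
	return true;
}

/* Toy governor: pick a deeper state when at least one device is idle. */
static int select_state(int idle_devices)
{
	return idle_devices > 0 ? 2 : 1;
}

/* As in the quoted flow, ->enter must re-enable IRQs before returning. */
static int enter_state(int state)
{
	assert(!irqs_enabled);
	irqs_enabled = true;
	return state;
}

static int idle_once(void)
{
	int idle_devices = 0;
	int state;
	size_t i;

	/* Early part of the idle path: interrupts are still enabled. */
	for (i = 0; i < pre_idle_count; i++)
		if (pre_idle_chain[i]())
			idle_devices++;
	state = select_state(idle_devices);

	/* Only now disable interrupts, right before entering the state. */
	irqs_enabled = false;
	return enter_state(state);
}
```

Registering demo_device_idle() and calling idle_once() lets the hook run in
a context where it could sleep. Whether the real idle path can tolerate the
window between selecting a state and disabling interrupts is exactly the
open question.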

[Added a few CCs to people that may be interested.]

Thanks,
Rafael