Re: [PATCH v12 3/4] cpuidle: Export the next timer/tick expiration for a CPU

From: Ulf Hansson
Date: Mon Mar 25 2019 - 10:24:06 EST


On Mon, 25 Mar 2019 at 13:21, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>
> On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> > To predict the sleep duration for a CPU that is entering idle, it is
> > extremely useful to know when the next timer/tick is going to expire.
> > Both the teo and the menu cpuidle governors already make use of this
> > information when selecting an idle state.
> >
> > Moving forward, a similar prediction needs to be made, but for a group of
> > idle CPUs rather than for a single idle CPU. Subsequent changes implement a
> > new genpd governor, which needs this.
> >
> > Support this by sharing a new function called
> > tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> > tick, whichever expires first.
> >
> > Additionally, when cpuidle is about to invoke the ->enter() callback,
> > call tick_nohz_get_next_hrtimer() and store its return value in the per-CPU
> > struct cpuidle_device, so as to make it available outside cpuidle.
> >
> > Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> > governor's ->select() callback has already decided whether to stop
> > the tick or not. As a result, tick_nohz_get_next_hrtimer() actually returns
> > the next timer expiration, whatever its origin.
> >
> > Cc: Lina Iyer <ilina@xxxxxxxxxxxxxx>
> > Co-developed-by: Lina Iyer <lina.iyer@xxxxxxxxxx>
> > Co-developed-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> > Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > ---
> >
> > Changes in v12:
> > - New patch.
> >
> > ---
> > drivers/cpuidle/cpuidle.c | 8 ++++++++
> > include/linux/cpuidle.h | 1 +
> > include/linux/tick.h | 7 ++++++-
> > kernel/time/tick-sched.c | 12 ++++++++++++
> > 4 files changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > index 7f108309e871..255365b1a6ab 100644
> > --- a/drivers/cpuidle/cpuidle.c
> > +++ b/drivers/cpuidle/cpuidle.c
> > @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> > int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> > int index)
> > {
> > + /*
> > + * Store the next hrtimer, which is either the next tick or the next
> > + * timer event, whichever expires first. Additionally, to make this data
> > + * useful for consumers outside cpuidle, we rely on the governor's
> > + * ->select() callback having decided whether to stop the tick or not.
> > + */
> > + dev->next_hrtimer = tick_nohz_get_next_hrtimer();
>
> I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for
> reading that value in the next patch, as a matter of annotation if
> nothing else.

Okay!
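
As a rough illustration of that annotation, here is a userspace sketch
of the store and the load. It is not the actual patch: the macros are
simplified stand-ins for the kernel's WRITE_ONCE()/READ_ONCE() (they
rely on the GNU __typeof__ extension), and the struct and helper names
are hypothetical.

```c
#include <stdint.h>

/*
 * Simplified userspace stand-ins for the kernel's WRITE_ONCE()/READ_ONCE().
 * The volatile access keeps the compiler from tearing, fusing or
 * re-reading the access; it implies no memory ordering.
 */
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down stand-in for struct cpuidle_device. */
struct cpuidle_device_sketch {
	int64_t next_hrtimer;	/* ktime_t is a signed 64-bit ns value */
};

/* Writer side: roughly what cpuidle_enter() would do before ->enter(). */
static void store_next_hrtimer(struct cpuidle_device_sketch *dev, int64_t next)
{
	WRITE_ONCE(dev->next_hrtimer, next);
}

/* Reader side: roughly what a consumer outside cpuidle would do. */
static int64_t load_next_hrtimer(struct cpuidle_device_sketch *dev)
{
	return READ_ONCE(dev->next_hrtimer);
}
```

The point of the pairing is mostly documentation: it marks next_hrtimer
as a field that is written on one CPU and may be read from another
without a lock.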

>
> > +
> > if (cpuidle_state_is_coupled(drv, index))
> > return cpuidle_enter_state_coupled(dev, drv, index);
> > return cpuidle_enter_state(dev, drv, index);
>
> Also I would clear next_hrtimer here to avoid dragging stale values
> around.

Right, I can do that.

However, at least in my case it would be an unnecessary update of the
variable, as I am never in a path where the value can be "stale". Even
if one theoretically could use a stale value, it seems unlikely to
be an issue, don't you think? Anyway, if I don't hear from you, I'll make
the change as you suggested.
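
For reference, a minimal userspace sketch of how the suggested change
could look: set the value before entering the idle state and clear it
afterwards. This is an assumption about the shape of the next revision,
not the final code. The macros are simplified stand-ins for the kernel
ones, and fake_next_hrtimer(), fake_enter_state() and struct dev_sketch
are hypothetical placeholders for tick_nohz_get_next_hrtimer(),
cpuidle_enter_state() and struct cpuidle_device.

```c
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for struct cpuidle_device. */
struct dev_sketch {
	int64_t next_hrtimer;		/* valid only while the CPU is idle */
	int64_t seen_during_idle;	/* records what a reader saw while idle */
};

/* Hypothetical stand-in for tick_nohz_get_next_hrtimer(). */
static int64_t fake_next_hrtimer(void)
{
	return 2000000;	/* arbitrary expiry time, in ns */
}

/* Hypothetical stand-in for cpuidle_enter_state(). */
static int fake_enter_state(struct dev_sketch *dev, int index)
{
	/* While "idle", a consumer can read the stored expiry. */
	dev->seen_during_idle = READ_ONCE(dev->next_hrtimer);
	return index;
}

static int cpuidle_enter_sketch(struct dev_sketch *dev, int index)
{
	int ret;

	WRITE_ONCE(dev->next_hrtimer, fake_next_hrtimer());
	ret = fake_enter_state(dev, index);
	/* Clear on the way out so that no stale value is left behind. */
	WRITE_ONCE(dev->next_hrtimer, 0);

	return ret;
}
```

With this shape, a reader that sees 0 knows the CPU is not in the idle
path, at the cost of one extra store per idle entry.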

>
> Apart from this the series LGTM.

Great, thanks. I'll re-spin a new version.

Kind regards
Uffe