Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended

From: Travis
Date: Thu Feb 12 2015 - 22:08:44 EST


Sounds good to me!

On Feb 12, 2015 8:03 PM, "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx> wrote:
>
> On Friday, February 13, 2015 08:53:38 AM John Stultz wrote:
> > On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > >
> > > Theoretically, ktime_get_mono_fast_ns() may be executed after
> > > timekeeping has been suspended (or before it is resumed) which
> > > in turn may lead to undefined behavior, for example, when the
> > > clocksource read from timekeeping_get_ns() called by it is
> > > not accessible at that time.
> >
> > And the callers of the ktime_get_mono_fast_ns() have to get back a
> > value?
>
> Yes, they do.
>
> > Or can we return an error on timekeeping_suspended like we do
> > w/ __getnstimeofday64()?
>
> No, we can't.
>
> > Also, what exactly is the case when the clocksource being read isn't
> > accessible? I see this is conditionalized on
> > CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern that on resume we read
> > the clocksource and it's been reset, causing a crazy time value?
>
> The clocksource's ->suspend method may have been called (during suspend)
> and depending on what that did we may even crash things theoretically.
>
> During resume, before the clocksource's ->resume callback, it may just
> be undefined behavior (random data etc).
>
> For system suspend as we have today the window is quite narrow, but after
> patch [4/6] from this series suspend-to-idle may suspend timekeeping and
> just sit there in idle for extended time (hours even) which broadens the
> potential exposure quite a bit.
>
> Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns()
> is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle
> with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP
> and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may
> happen.
>
> > > Prevent that from happening by setting up a dummy readout base for
> > > the fast timekeeper during timekeeping_suspend() such that it will
> > > always return the same number of cycles.
> > >
> > > After the last timekeeping_update() in timekeeping_suspend() the
> > > clocksource is read and the result is stored as cycles_at_suspend.
> > > The readout base from the current timekeeper is copied onto the
> > > dummy and the ->read pointer of the dummy is set to a routine
> > > unconditionally returning cycles_at_suspend. Next, the dummy is
> > > passed to update_fast_timekeeper().
> > >
> > > Then, ktime_get_mono_fast_ns() will work until the subsequent
> > > timekeeping_resume() and the proper readout base for the fast
> > > timekeeper will be restored by the timekeeping_update() called
> > > right after clearing timekeeping_suspended.
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > ---
> > >  kernel/time/timekeeping.c |   22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > >
> > > Index: linux-pm/kernel/time/timekeeping.c
> > > ===================================================================
> > > --- linux-pm.orig/kernel/time/timekeeping.c
> > > +++ linux-pm/kernel/time/timekeeping.c
> > > @@ -1249,9 +1249,23 @@ static void timekeeping_resume(void)
> > >         hrtimers_resume();
> > >  }
> > >
> > > +/*
> > > + * Dummy readout base and suspend-time cycles value for the fast timekeeper to
> > > + * work in a consistent way after timekeeping has been suspended if the core
> > > + * timekeeper clocksource is not suspend-nonstop.
> > > + */
> > > +static struct tk_read_base tkr_dummy;
> > > +static cycle_t cycles_at_suspend;
> > > +
> > > +static cycle_t dummy_clock_read(struct clocksource *cs)
> > > +{
> > > +       return cycles_at_suspend;
> > > +}
> > > +
> > >  static int timekeeping_suspend(void)
> > >  {
> > >         struct timekeeper *tk = &tk_core.timekeeper;
> > > +       struct clocksource *clock = tk->tkr.clock;
> > >         unsigned long flags;
> > >         struct timespec64               delta, delta_delta;
> > >         static struct timespec64        old_delta;
> > > @@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void)
> > >         }
> > >
> > >         timekeeping_update(tk, TK_MIRROR);
> > > +
> > > +       if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) {
> > > +               memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy));
> > > +               cycles_at_suspend = tk->tkr.read(clock);
> > > +               tkr_dummy.read = dummy_clock_read;
> > > +               update_fast_timekeeper(&tkr_dummy);
> > > +       }
> >
> > It's a little ugly... though I'm not sure I have a better idea right off.
> >
> > thanks
> > -john
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
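
For anyone reading along, the idea in the quoted changelog (copy the readout
base, swap its ->read for a routine that returns the cycle count captured at
suspend, and point the fast path at the copy) can be sketched in plain
userspace C. This is only an illustration under loud assumptions: struct
read_base, hw_read(), fake_hw_counter, sketch_suspend() and sketch_resume()
are hypothetical stand-ins, not kernel code, and the sketch leaves out the
locking and NMI-safety that the real update_fast_timekeeper() has to provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cycle_t;

/* Hypothetical, simplified stand-in for the kernel's struct tk_read_base. */
struct read_base {
	cycle_t (*read)(void);	/* clocksource read callback */
	cycle_t cycle_last;	/* cycle count at the last update */
	uint32_t mult;		/* cycles -> ns multiplier */
	uint32_t shift;		/* cycles -> ns shift */
	uint64_t base_ns;	/* nanoseconds at the last update */
};

/* Pretend hardware counter (assumption: stops being readable in suspend). */
cycle_t fake_hw_counter;
cycle_t hw_read(void) { return fake_hw_counter; }

/* Frozen value handed out while "suspended". */
cycle_t cycles_at_suspend;
cycle_t dummy_read(void) { return cycles_at_suspend; }

struct read_base live = { hw_read, 0, 1, 0, 0 };
struct read_base dummy;
const struct read_base *fast = &live;	/* what the fast path reads through */

/* Fast-path readout: base + scaled delta since the last update. */
uint64_t fast_ns(void)
{
	const struct read_base *b = fast;
	cycle_t delta = b->read() - b->cycle_last;

	return b->base_ns + ((delta * b->mult) >> b->shift);
}

/* Mirror of the suspend-side steps described in the changelog. */
void sketch_suspend(void)
{
	assert(live.read != NULL);
	dummy = live;				/* copy the readout base */
	cycles_at_suspend = live.read();	/* last safe hardware read */
	dummy.read = dummy_read;		/* never touch hardware again */
	fast = &dummy;				/* fast path now uses the dummy */
}

/* On resume the proper readout base is put back. */
void sketch_resume(void)
{
	fast = &live;
}
```

With this in place, a caller that sets fake_hw_counter, calls
sketch_suspend(), and then changes the counter again keeps getting the
suspend-time value from fast_ns() until sketch_resume() runs. That matches
the behavior the changelog describes: time appears frozen to the fast
timekeeper during suspend rather than being read from hardware that may not
be accessible.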