Re: [PATCH][v2] timekeeping: Fix memory overwrite of sleep_time_bin array
From: Rafael J. Wysocki
Date: Wed Jul 20 2016 - 08:55:09 EST
On Wednesday, July 20, 2016 07:06:58 PM Chen Yu wrote:
> Hi Thomas,
> On Tue, Jul 19, 2016 at 12:40:14PM +0200, Thomas Gleixner wrote:
> > On Tue, 19 Jul 2016, Chen Yu wrote:
> > > > On 2016年07月19日 16:36, Thomas Gleixner wrote:
> > > > On Tue, 19 Jul 2016, Chen Yu wrote:
> > > > > > Further investigation shows that the problem is caused by setting
> > > > > > /sys/power/pm_trace to 1 before the 1st hibernation, since once
> > > > > > pm_trace is enabled, the RTC holds a meaningless value after resume,
> > > >
> > > > So why is the RTC value useless if pm_trace is enabled? I really have a hard
> > > > time to understand why pm_trace would affect the sleep time readout from
> > > > RTC.
> > >
> > > > After pm_trace is enabled, a hash of each device's name is written to
> > > > the RTC during system suspend/hibernate, so the RTC value depends on what
> > > > was written in the last suspend round. This is how pm_trace helps diagnose
> > > > which device failed to suspend (e.g. suspending this device hangs the
> > > > system, we reboot, and then check the hash value in the RTC).
> > >
> > > > In our case, after the first hibernate/resume round, we found the current
> > > > system time was in 2117, so syscore_resume -> timekeeping_resume:
> > > > __timekeeping_inject_sleeptime(tk, &ts_delta) would inject a very large
> > > > delta (2117 - 2017 years), and thus the sleep_time_bin array overflows.
> > While the range check is certainly correct and a good thing to have it's wrong
> > in the first place to call __timekeeping_inject_sleeptime() in case that
> > > pm_trace is enabled simply because that "hash" time value will also wreck
> > timekeeping. Your patch is just curing the symptom in the debug code but not
> > fixing the root cause.
> OK. I've modified the patch.
> In case I break any other stuff :p, could you help check
> if this patch is in the right direction, thanks:
> 1. There are two places that invoke __timekeeping_inject_sleeptime(),
> timekeeping_resume and rtc_resume, so we need to deal with
> them separately.
> 2. For rtc_resume, if pm_trace has ever been enabled,
> we bypass the injection of sleep time.
> 3. For timekeeping_resume:
> Currently we use either a nonstop clocksource or the persistent
> clock to get the sleep time. Since pm_trace breaks systems that use the RTC
> as a persistent clock, x86 is affected. So we add a
> check for x86: if pm_trace has ever been enabled, we cannot
> trust the persistent clock delta read from the RTC, and thus bypass
> the injection of sleep time in this case.
> 4. Why we check the history of pm_trace: once pm_trace
> has been enabled, the RTC delta is not reliable anymore.
> For example, if we only checked the current pm_trace value, we might still get
> a memory overwrite:
> 4.1 echo 1 > /sys/power/pm_trace
> 4.2 hibernate/resume (rtc is broken, do not add delta from rtc because pm_trace is 1)
> 4.3 echo 0 > /sys/power/pm_trace
> 4.4 hibernate/resume (rtc is still broken, but add delta from rtc because pm_trace is 0)
The initial state of the RTC is invalid, but will the delta still be invalid?
And what if the admin fixes up the RTC before hibernating? You will still discard
the RTC delta until the next reboot, right?