RE: [PATCH][v2] timekeeping: Fix memory overwrite of sleep_time_bin array

From: Chen, Yu C
Date: Wed Jul 20 2016 - 12:59:51 EST


Hi,
> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@xxxxxxxxxxxxx]
> Sent: Wednesday, July 20, 2016 9:00 PM
> To: Chen, Yu C
> Cc: Thomas Gleixner; John Stultz; Linux PM; Linux Kernel Mailing List
> Subject: Re: [PATCH][v2] timekeeping: Fix memory overwrite of sleep_time_bin
> array
>
> On Wednesday, July 20, 2016 07:06:58 PM Chen Yu wrote:
> > Hi Thomas,
> > On Tue, Jul 19, 2016 at 12:40:14PM +0200, Thomas Gleixner wrote:
> > > On Tue, 19 Jul 2016, Chen Yu wrote:
> > > > On 2016å07æ19æ 16:36, Thomas Gleixner wrote:
> > > > > On Tue, 19 Jul 2016, Chen Yu wrote:
> > > > > > Further investigation shows that, the problem is caused by
> > > > > > setting /sys/power/pm_trace to 1 before the 1st hibernation,
> > > > > > since once pm_trace is enabled, the rtc becomes an
> > > > > > unmeaningful value after resumed,
> > > > >
> > > > > So why is the RTC value useless if pm_trace is enabled? I really
> > > > > have a hard time to understand why pm_trace would affect the
> > > > > sleep time readout from RTC.
> > > >
> > > > After pm_trace is enabled, during system suspend/hibernate, the
> > > > hash name of each devices will be written to rtc, so the rtc value
> > > > depends on what we write in last suspend round, thus pm_trace can
> > > > be used for diagnose which device failed to suspend(eg, the
> > > > suspending on this device hang the system, we reboot the system , and
> check rtc hash value).
> > > >
> > > > In our case, after first hibernate/resume round, we found our
> > > > current system time is at 2117, so syscore_resume ->
> timekeeping_resume :
> > > > __timekeeping_inject_sleeptime(tk, &ts_delta) would inject a quite
> > > > large delta : 2117 - 2017 year, thus the sleep_time_bin is overflow.
> > >
> > > While the range check is certainly correct and a good thing to have
> > > it's wrong in the first place to call
> > > __timekeeping_inject_sleeptime() in case that pm_trace is enabled
> > > simply because that "hash" time value will also wreckage
> > > timekeeping. Your patch is just curing the symptom in the debug code but
> not fixing the root cause.
> > >
> > OK. I've modified the patch.
> > In case I break any other stuff :p, could you help check if this patch
> > is in the right direction, thanks:
> >
> > 1. There are two places would invoke __timekeeping_inject_sleeptime(),
> > they are timekeeping_resume and rtc_resume, so we need to deal with
> > them respctively.
> >
> > 2. for rtc_resume, if the pm_trace has once been enabled,
> > we bypass the injection of sleep time.
> >
> > 3. for timekeeping_resume,
> > Currently we either use nonstop clock source, or use persistent
> > clock to get the sleep time. As pm_trace breaks systems who use rtc
> > as a persistent clock, x86 is affected. So we add a
> > check for x86 that, if the pm_trace has been enabled, we can not
> > trust the persistent clock delta read from rtc, thus bypass
> > the injection of sleep time in this case.
> >
> > 4. Why we checked the history of pm_trace: once pm_trace
> > has been enabled, the delta of rtc would not be reliable anymore.
> > For example, if we only check current pm_trace, we might still get
> > memory overwrite:
> >
> > 4.1 echo 1 > /sys/power/pm_trace
> > 4.2 hibernate/resume (rtc is broken, do not add delta from rtc because
> pm_trace is 1)
> > 4.3 echo 0 > /sys/power/pm_trace
> > 4.4 hibernate/resume (rtc is still broken, but add delta from rtc
> > because pm_trace is 0)
>
> The initial state of the RTC is invalid, but will the delta be still invalid?
>
According to feedback from the bug reporter,
with previous patch applied which uses pm_trace_is_enabled() directly in
resume, then after several rounds, once the pm_trace is disabled,
the following hibernate/resume would bring a overflow sleep delta.
It looks like the delta also considers the historical(previous) delta,
so I added the history pm_once_trace . I'll confirm and give feedback later.

> And what if the admin fixes up the RTC before hibernating? You will still
> discard the RTC delta until the next reboot, right?
Yes, it will be discarded, I agree we should not bypass the delta if someone has
fixed the rtc, I'll dig into the code.

Thanks,
Yu