Re: TSC not updating after resume: Bug or Feature?

From: Fabio Comolli
Date: Mon Dec 22 2008 - 15:39:45 EST


Hi Thomas.

On Mon, Dec 22, 2008 at 9:36 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Mon, 22 Dec 2008, Thomas Gleixner wrote:
>> On Mon, 22 Dec 2008, Ingo Molnar wrote:
>> > * Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> >
>> > > > By the way, I don't know if it matters but the problema happened with
>> > > > in-kernel hibernation and also in out-of-tree TuxOnIce hibernation.
>> > > > Maybe this can help debugging the issue, I don't know.
>> > >
>> > > Hmm, does not ring a bell here. Can you please apply the patch below to
>> > > mainline and retest ?
>> >
>> > ... and he should send a dmesg after a suspend cycle, right?
>>
>> Yes :)
>>
>> I digged more in the bugzillas. Toralf added some debug to
>> __update_sched_clock():
>>
>> - max_clock = wrap_max(scd->clock, scd->tick_gtod + TICK_NSEC);
>> + max_clock = scd->tick_gtod + TICK_NSEC;
>> + if (scd->clock > max_clock)
>> + printk(KERN_INFO "%d %d\n", scd->clock, max_clock);
>>
>> The interesting output is:
>>
>> Dec 14 21:55:55 n22 Back to C!
>> Dec 14 21:55:55 n22 Extended CMOS year: 2000
>> Dec 14 21:55:55 n22 Force enabled HPET at resume
>> Dec 14 21:55:55 n22 212611283 77
>>
>> The 77 is totaly bogus and it's likely not just a truncation of the
>> 64bit value because (scd->clock > max_clock) evaluates to true. This
>> output is _AFTER_ timekeeping resume because the HPET force enable
>> message comes from timekeeping resume.
>
> Seems I'm talking to myself, but I think I finally decoded the
> mystery:
>
> resume()
> cpufreq_resume()
> tsc:time_cpufreq_notifier()
> set_cyc2ns_scale()
> sched_clock_idle_sleep_event()
> sched_clock_tick()
> ktime_get()
> hpet_read()
>
> This happens _BEFORE_ timekeeping has resumed, so hpet_read() returns
> nonsense and the timekeeping code uses the stale pre suspend
> xtime/clocksource reference values to calculate the time. So the gtod
> reference in sched_clock can result in total crap depending on the
> time when the suspend happened.
>
> Shaggys patch clamps sched_clock to the stale scd->clock value which
> might explain the further wreckage.
>
> The above sequence happens only for CPUs with a CPU frequency coupled
> TSC, so on newer machines with CPU frequency invariant TSC this does
> not happen.
>
> /me stomps off to find a box to confirm that.
>

Ok, just for reference here is my

fcomolli@hawking:~> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 1.73GHz
stepping : 8
cpu MHz : 800.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx up bts est tm2
bogomips : 1596.95
clflush size : 64
power management:



Regards,
Fabio
> Thanks,
>
> tglx
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/