"Core temperature above threshold" on Fujitsu U757 with 2 core Kaby Lake (i7-7600U)

From: Christoph Anton Mitterer
Date: Fri Oct 27 2017 - 21:34:00 EST


Hey.

Perhaps someone can help me with this.


I got a brand new notebook from the university, a Fujitsu U757[0][1],
with a 2 core Kaby Lake (i7-7600U) and 32GB RAM.
It runs Debian unstable, that is as of now kernel 4.13.4.

Even with fairly light workloads (just some VM running and a bit more),
the CPU cores seem to overheat (>100°C).
I brought the thing back to the university's vendor and they claimed
that they couldn't reproduce this with their (Windows-based) tests and
that it might be an OS issue (they did replace the heat paste at my
request).


The kernel logs quite regularly give:
Oct 28 03:15:19 heisenberg kernel: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1207)
Oct 28 03:15:19 heisenberg kernel: CPU0: Core temperature above threshold, cpu clock throttled (total events = 1207)
Oct 28 03:15:19 heisenberg kernel: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1394)
Oct 28 03:15:19 heisenberg kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 1394)
Oct 28 03:15:19 heisenberg kernel: CPU0: Package temperature above threshold, cpu clock throttled (total events = 1394)
Oct 28 03:15:19 heisenberg kernel: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1394)
Oct 28 03:15:19 heisenberg kernel: CPU0: Core temperature/speed normal
Oct 28 03:15:19 heisenberg kernel: CPU2: Core temperature/speed normal
Oct 28 03:15:19 heisenberg kernel: CPU3: Package temperature/speed normal
Oct 28 03:15:19 heisenberg kernel: CPU1: Package temperature/speed normal
Oct 28 03:15:19 heisenberg kernel: CPU2: Package temperature/speed normal
Oct 28 03:15:19 heisenberg kernel: CPU0: Package temperature/speed normal

I guess this happens every time it goes beyond 100°C.
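
To make it easier to compare runs, here is a minimal Python sketch of how
the throttling messages above could be tallied from a log. The regular
expression is written against the exact message format shown in the excerpt
above; the function name summarize_throttling is made up for illustration
and is not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1207)"
THROTTLE_RE = re.compile(
    r"CPU(?P<cpu>\d+): (?P<scope>Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<total>\d+)\)"
)


def summarize_throttling(lines):
    """Tally throttle messages from an iterable of log lines.

    Returns a pair:
      - Counter mapping (cpu, scope) to the number of messages seen
      - dict mapping scope ("Core"/"Package") to the highest
        "total events" counter reported in any message
    """
    messages = Counter()
    highest_total = {}
    for line in lines:
        m = THROTTLE_RE.search(line)
        if m is None:
            continue  # not a throttle message (e.g. "temperature/speed normal")
        cpu = int(m.group("cpu"))
        scope = m.group("scope")
        total = int(m.group("total"))
        messages[(cpu, scope)] += 1
        highest_total[scope] = max(highest_total.get(scope, 0), total)
    return messages, highest_total
```

It could be fed an open log file, for example
summarize_throttling(open("/var/log/kern.log")), where the path is only an
assumption about the local syslog setup.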

Once so far I had a complete lockup of the machine (it still seemed to
write data to the HDD, but I could only hard power cycle it to get
it usable again).
I'm not sure whether this is related to the temperature issue.
See the attached kernel log.

At around Oct 15 22:46:39 there first seems to be a crash of the Wifi
microcode; a bit later, beginning at about Oct 16 01:27:16, there are
numerous stack traces with "BUG: soft lockup - CPU".


Could this be some kernel issue? Especially the overheating... I mean
obviously not in the sense that it's the kernel's fault, but in the
sense that it should throttle the CPU earlier or so...?


Interestingly, when I run e.g. stress or stress-ng on all 4 logical
CPUs, then sometimes I do get the overheating and sometimes not (in
which case the temperature stays above 90°C but, I assume, always
below 100°C).
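
Since the "I assume" above is only a guess, the temperatures could be
sampled during such a stress run. Below is a minimal Python sketch that
reads the coretemp sensors via the hwmon sysfs interface (name,
temp*_label, temp*_input, temp*_max and temp*_crit, with values in
millidegrees Celsius). The function name read_coretemp and the
hwmon_root parameter are illustrative; which labels and limits actually
show up on the U757 is not something I have verified.

```python
from pathlib import Path


def read_coretemp(hwmon_root="/sys/class/hwmon"):
    """Return {label: (current_C, max_C, crit_C)} for all coretemp sensors.

    max_C and crit_C are None when the corresponding sysfs file is missing.
    """

    def read_milli(path):
        # hwmon reports temperatures in millidegrees Celsius.
        try:
            return int(path.read_text().strip()) / 1000.0
        except (OSError, ValueError):
            return None

    readings = {}
    for hwmon in sorted(Path(hwmon_root).glob("hwmon*")):
        try:
            if (hwmon / "name").read_text().strip() != "coretemp":
                continue  # some other sensor chip
        except OSError:
            continue
        for inp in sorted(hwmon.glob("temp*_input")):
            prefix = inp.name[: -len("_input")]  # e.g. "temp1"
            current = read_milli(inp)
            if current is None:
                continue
            try:
                label = (hwmon / f"{prefix}_label").read_text().strip()
            except OSError:
                label = prefix  # fall back to the file prefix
            readings[label] = (
                current,
                read_milli(hwmon / f"{prefix}_max"),
                read_milli(hwmon / f"{prefix}_crit"),
            )
    return readings
```

Calling read_coretemp() once a second while stress-ng runs and printing the
result would show how close the cores get to the reported limits.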



Any help would be welcome, do not hesitate to ask if you need more data
(keep me CCed).

Thanks,
Chris.



[0] http://www.fujitsu.com/fts/products/computing/pc/notebooks/lifebook-u757/
[1] http://docs.ts.fujitsu.com/dl.aspx?id=addf5093-b73b-407b-ae78-90c5baf6456a

Attachment: kern.log.xz
Description: application/xz