Re: [lm-sensors] 3.13.?: Strange / dangerous fan policy...

From: Jean Delvare
Date: Sat Mar 08 2014 - 06:08:47 EST


On Fri, 7 Mar 2014 14:52:30 -0800, Guenter Roeck wrote:
> On Fri, Mar 07, 2014 at 11:04:29PM +0100, Manuel Krause wrote:
> > Hi, and thanks for the quick response!
> > No special fancy "fan control policy". 'fancontrol' isn't up or
> > running.
> > Vanilla kernels 3.11.* and 3.12.* had been working on here without
> > any extra work.
> > --
> > # sensors
> > acpitz-virtual-0
> > Adapter: Virtual device
> > temp1: +71.0°C (crit = +256.0°C)
> > temp2: +69.0°C (crit = +110.0°C)
> > temp3: +52.0°C (crit = +105.0°C)
> > temp4: +25.0°C (crit = +110.0°C)
> > temp5: +58.0°C (crit = +110.0°C)
> >
> > coretemp-isa-0000
> > Adapter: ISA adapter
> > Core 0: +62.0°C (high = +105.0°C, crit = +105.0°C)
> > Core 1: +60.0°C (high = +105.0°C, crit = +105.0°C)
> > --
> > My notebook (HP/Compaq 6730b) does not have a separate fan sensor.
> > This is with 3.12.13 with my normal workload.
> >
> > Please, trust my above mentioned values of 94 °C vs. 74°C as I
> > don't like to boot 3.13.6 anymore, to avoid harm to the notebook's
> > casing.
>
> Understood. Unfortunately, we'll need to get information
> from the new kernel to be able to track down the problem.

Indeed. Not only the run-time temperatures, but also the high and crit
limits.
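
One way to capture all of that in one go is to read the hwmon sysfs
files directly. The following is only a minimal sketch, assuming the
standard hwmon attribute names (tempN_input, tempN_max, tempN_crit,
tempN_label, in millidegrees Celsius) under /sys/class/hwmon. The
function names are made up for illustration, and on some kernels the
attributes sit in the "device" subdirectory, which the sketch also tries.

```python
#!/usr/bin/env python3
"""Dump hwmon temperatures together with their high and crit limits."""
import glob
import os


def read_text(path, default=""):
    """Return the stripped content of a sysfs file, or default if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return default


def read_degrees(path):
    """Convert a millidegree sysfs value to degrees Celsius, or None."""
    try:
        return int(read_text(path)) / 1000.0
    except ValueError:
        return None


def collect_temps(root="/sys/class/hwmon"):
    """Return one dict per temperature input found below root."""
    rows = []
    for chip in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        # Assumption: attributes are in the hwmon directory itself; if not,
        # fall back to its "device" subdirectory.
        for base in (chip, os.path.join(chip, "device")):
            inputs = sorted(glob.glob(os.path.join(base, "temp*_input")))
            if not inputs:
                continue
            name = (read_text(os.path.join(base, "name"))
                    or read_text(os.path.join(chip, "name"))
                    or os.path.basename(chip))
            for inp in inputs:
                prefix = inp[:-len("_input")]
                rows.append({
                    "chip": name,
                    "label": read_text(prefix + "_label",
                                       os.path.basename(prefix)),
                    "input": read_degrees(inp),
                    "high": read_degrees(prefix + "_max"),
                    "crit": read_degrees(prefix + "_crit"),
                })
            break
    return rows


def fmt(value):
    """Format a temperature, or 'n/a' when the attribute is missing."""
    return "n/a" if value is None else f"{value:+.1f}°C"


if __name__ == "__main__":
    for row in collect_temps():
        print(f"{row['chip']:<12} {row['label']:<10} {fmt(row['input'])}"
              f"  (high = {fmt(row['high'])}, crit = {fmt(row['crit'])})")
```

Running it once on 3.12 and once on 3.13 under a similar load would give
two outputs that can be compared line by line.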

> > But I'd be glad to test any improvement patch.
>
> So far I have no idea what is going on. I don't see anything in the
> drivers providing above data that would explain the behavior,
> but I might be missing something.

Looks like a regression in the ACPI subsystem or in power management,
not hwmon. Hwmon merely reports the temperatures; it is not
responsible for the actual temperatures.

A bisection would certainly help, but of course that would require
booting to a bad kernel half of the time, which I understand Manuel
wouldn't enjoy.
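
To give a feel for the cost: a bisection halves the range of candidate
commits at every step, so the number of kernels to build and boot grows
only with the base-2 logarithm of the range size. The sketch below
illustrates that arithmetic. The commit counts in it are hypothetical
round numbers, not measured from the actual 3.12 to 3.13 range.

```python
import math


def bisect_steps(n_commits):
    """Approximate number of test boots needed to bisect n_commits."""
    if n_commits <= 1:
        return 0
    return math.ceil(math.log2(n_commits))


if __name__ == "__main__":
    # Hypothetical range sizes, for illustration only.
    for n in (100, 1000, 10000):
        print(f"{n:>6} commits: about {bisect_steps(n)} kernels to boot")
```

With roughly half of those boots landing on a bad kernel, restricting the
bisection to the suspected directories would shorten the range further.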

The only two components which I think can reach such high temperatures
in a laptop are the CPU and the GPU. I suppose that the "94 °C vs.
74°C" refers to acpitz's temp1? If the temperatures reported by
coretemp remain the same, then I can only suppose that temp1 is the GPU
temperature. Please tell us which GPU is in this laptop, and which
driver you're using.
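
If it helps, both pieces of information can be read from sysfs. This is
a minimal sketch, assuming the usual layout of /sys/bus/pci/devices,
where each device has "class", "vendor" and "device" files and, when a
driver is bound, a "driver" symlink. PCI base class 0x03 denotes display
controllers. The function name is made up for illustration.

```python
#!/usr/bin/env python3
"""List PCI display controllers and the kernel driver bound to each."""
import glob
import os


def read_text(path):
    """Return the stripped content of a sysfs file, or '' if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def list_display_devices(root="/sys/bus/pci/devices"):
    """Return a dict per PCI device whose base class is 0x03 (display)."""
    devices = []
    for dev in sorted(glob.glob(os.path.join(root, "*"))):
        # The class file holds a value such as 0x030000; the first byte
        # after "0x" is the base class.
        if not read_text(os.path.join(dev, "class")).startswith("0x03"):
            continue
        link = os.path.join(dev, "driver")
        driver = (os.path.basename(os.readlink(link))
                  if os.path.islink(link) else None)
        devices.append({
            "address": os.path.basename(dev),
            "vendor": read_text(os.path.join(dev, "vendor")),
            "device": read_text(os.path.join(dev, "device")),
            "driver": driver,  # None means no driver is bound
        })
    return devices


if __name__ == "__main__":
    for d in list_display_devices():
        print(f"{d['address']}  {d['vendor']}:{d['device']}  "
              f"driver={d['driver'] or 'none'}")
```

The vendor and device IDs printed this way identify the GPU model, and
the driver name shows which kernel module is handling it.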

--
Jean Delvare
SUSE L3 Support