Re: 3.13.?: Strange / dangerous fan policy...

From: Guenter Roeck
Date: Sat Apr 05 2014 - 22:44:52 EST


On 04/05/2014 07:37 PM, Manuel Krause wrote:
On 2014-04-01 01:47, Guenter Roeck wrote:
On 03/31/2014 04:37 PM, Manuel Krause wrote:
On 2014-03-20 21:21, Manuel Krause wrote:
On 2014-03-11 22:59, Manuel Krause wrote:
On 2014-03-10 02:49, Manuel Krause wrote:
On 2014-03-09 18:58, Rafael J. Wysocki wrote:
On Sunday, March 09, 2014 01:10:25 AM Manuel Krause wrote:
On 2014-03-08 16:59, Guenter Roeck wrote:
On 03/08/2014 03:08 AM, Jean Delvare wrote:
On Fri, 7 Mar 2014 14:52:30 -0800, Guenter Roeck wrote:
On Fri, Mar 07, 2014 at 11:04:29PM +0100, Manuel Krause
wrote:
[SNIP]

Long time no reply from you... Have I overlooked an unwritten
convention? Or were my charts unusable for your
analysis/work?

Two days ago, I tried the 3.14.0-rc7-vanilla. And the problem
persists. "Strange / dangerous fan policy..."

Since kernel 3.13.6 I've managed to 'fix' the potential
overheating problem by manually issuing a:
"echo 1 > /sys/class/thermal/cooling_device3/cur_state" *)
_before_ obviously critical temperatures occur. Reminder: this
particular setting may only work on my system! ...and it keeps
working with 3.14-rc.
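
A minimal sketch of the workaround above as a reusable shell function, for
anyone who wants to try it on an affected machine. The function name
set_cooling_state and its optional third argument (an alternative sysfs root,
useful for a dry run against a scratch directory) are illustrative
assumptions, not something from this thread. cur_state and max_state are the
standard cooling-device attributes in the thermal sysfs interface.

```shell
#!/bin/sh
# Sketch only: set a thermal cooling device to a given state via sysfs.
# set_cooling_state and the "root" argument are hypothetical names.
set_cooling_state() {
    dev="$1"                          # e.g. cooling_device3 (system-specific)
    state="$2"                        # e.g. 1
    root="${3:-/sys/class/thermal}"   # override only for dry runs
    node="$root/$dev"

    if [ ! -e "$node/cur_state" ]; then
        echo "no such cooling device: $dev" >&2
        return 1
    fi

    # Refuse states above max_state when the kernel exposes it.
    if [ -r "$node/max_state" ]; then
        max=$(cat "$node/max_state")
        if [ "$state" -gt "$max" ]; then
            echo "state $state exceeds max_state $max" >&2
            return 1
        fi
    fi

    echo "$state" > "$node/cur_state"
}
```

On the system described above the call would be
"set_cooling_state cooling_device3 1", run as root. The device index is
specific to that machine and will likely differ elsewhere.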

Below is a formatted dump of my /sys/class/thermal, produced by a
script I wrote for my system. It presents the results in the
layout used in linux/Documentation/thermal/sysfs-api.txt, point 3.
{I've uploaded the files to pastebin, so as not to swamp you and
the lists with so many lines of logs.}
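
The script itself is not included in this thread. As a rough sketch of what
such a dump could look like, the function below walks the thermal zones and
cooling devices and prints their main attributes. The name dump_thermal, the
output layout, and the optional root argument are assumptions made for
illustration; the attribute files (type, temp, trip_point_*_temp, cur_state,
max_state) are the standard thermal sysfs ones, with temp reported in
millidegrees Celsius.

```shell
#!/bin/sh
# Sketch only: print thermal zones and cooling devices under a sysfs root.
# dump_thermal and its output format are hypothetical, not the poster's script.
dump_thermal() {
    root="${1:-/sys/class/thermal}"

    for zone in "$root"/thermal_zone*; do
        [ -d "$zone" ] || continue
        printf '%s: type=%s temp=%s\n' "${zone##*/}" \
            "$(cat "$zone/type" 2>/dev/null)" \
            "$(cat "$zone/temp" 2>/dev/null)"
        # One line per trip point temperature, if any are exposed.
        for trip in "$zone"/trip_point_*_temp; do
            [ -e "$trip" ] || continue
            printf '  %s=%s\n' "${trip##*/}" "$(cat "$trip")"
        done
    done

    for dev in "$root"/cooling_device*; do
        [ -d "$dev" ] || continue
        printf '%s: type=%s cur_state=%s max_state=%s\n' "${dev##*/}" \
            "$(cat "$dev/type" 2>/dev/null)" \
            "$(cat "$dev/cur_state" 2>/dev/null)" \
            "$(cat "$dev/max_state" 2>/dev/null)"
    done
}
```

Running "dump_thermal" once on a good kernel and once on a bad one, then
diffing the two outputs, is one way to spot a cooling device that stays at
state 0 while the zone temperature climbs.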

For the last good kernel -- 3.12.14 -- in-use:
http://pastebin.com/HL1PNcda
For my first bad kernel revision 3.13 -- at critical temp:
http://pastebin.com/98hgf1a9
For the last bad kernel -- 3.14.0-rc7 -- at critical temp:
http://pastebin.com/MuTwTnjD
For the last bad kernel -- 3.14.0-rc7 -- after issuing the
*) command:
http://pastebin.com/2peda54z

Please, have a look at them! And maybe, give me hints on how I
can help you to further debug this issue, as my manual method
works but it's annoying.

And, PLEASE CC: ME, as I'm not on the lists. Or forward this
email thread to someone in charge.

Thank you for your work && best regards,
Manuel Krause


This is still BUG 71711
https://bugzilla.kernel.org/show_bug.cgi?id=71711

3.12.15 works very well
3.13.7 fails
3.14.0-rc8 fails


Best you can do would really be to bisect the problem.
Unfortunately only you (or someone else with an affected system)
can do that. Once the culprit is known it would be much easier
to get it fixed.

To answer your earlier question: I don't think you did anything
wrong.
I guess everyone else is just as clueless as I am (if not, speak up
and help ;-).

Guenter


I've now bisected twice, from two different kernel origins, just to
be sure, as I'm new to this stupid-and-lengthy method and wanted to
be certain I hadn't given a false positive in between out of boredom.


Not really. Keep in mind that you were able to track down the bad commit
among more than 10,000 commits in a reasonably short period of time.
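
To put that number in perspective: bisection halves the candidate range on
every build-and-test round, so roughly log2(N) rounds are enough for N
commits. The sketch below computes that bound with integer arithmetic.
bisect_steps is a hypothetical helper written for this illustration, and
git's own estimate can differ by a step or so depending on the shape of the
history.

```shell
#!/bin/sh
# Sketch only: upper bound on bisection rounds for n candidate commits,
# i.e. the smallest k with 2^k >= n. bisect_steps is a hypothetical name.
bisect_steps() {
    n="$1"
    steps=0
    span=1
    # Double the covered span until it reaches all n candidates.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}
```

For 10,000 commits, "bisect_steps 10000" prints 14. The rounds themselves
are driven with "git bisect start", "git bisect good <rev>" and
"git bisect bad <rev>", and "git bisect log" records the decisions made
along the way.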

In the end it says each time:
# git bisect bad | tee -a /var/log/bisect.log
cc8ef52707341e67a12067d6ead991d56ea017ca is the first bad commit
commit cc8ef52707341e67a12067d6ead991d56ea017ca
Author: Zhang Rui <rui.zhang@xxxxxxxxx>
Date: Wed Sep 25 20:39:45 2013 +0800

ACPI / AC: convert ACPI ac driver to platform bus

Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

Off to the two of you...

Guenter

:040000 040000 5a0d397cfcbf53c03390f2805b83754cb7837d84 4a2af1454f65d67f1d1a507c08e3b9ef3ffe57e7 M drivers


Please tell me how I can help debug this further, and please also read the latest comments at
https://bugzilla.kernel.org/show_bug.cgi?id=71711

Manuel Krause



