Re: [PATCH] i8k: Add support for temperature sensor labels

From: Guenter Roeck
Date: Sun Nov 30 2014 - 11:04:27 EST


On 11/30/2014 02:11 AM, Pali Rohár wrote:
On Sunday 30 November 2014 02:25:09 Guenter Roeck wrote:
On 11/29/2014 11:07 AM, Pali Rohár wrote:
[ ... ]

The original Dell DOS executable ignores all temperature sensors
if the sensor type SMM function fails (if I decoded and understood
that DOS assembler code correctly). So maybe we should do the
same...

Pali,

Makes me wonder - does the assembler code tell you what to do
if the reported temperature is invalid, and does it
distinguish between error codes ?

I do not see anything like that. But there are a lot of indirect
calls (the offset to the function pointer is stored in some global
in zero-initialized data), so it is hard to understand what that
DOS binary is doing. I'm happy that I decoded the loop which tries
to call that type function and, if it does not fail, then calls the
read temperature function. In that section I do not see any error
handling of invalid values (but it could be somewhere else).
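
The loop described above can be sketched roughly as follows. This is
only an illustration of the control flow as described in this mail,
not the decoded DOS code and not i8k.c. The names probe_sensors,
smm_fn, fake_get_type and fake_get_temp are hypothetical, and the two
fake_* functions merely stand in for the real SMM calls so the sketch
can run in user space.

```c
#include <stddef.h>

/* Hypothetical callback type standing in for an SMM call.
 * Assumption: a negative return value means the call failed. */
typedef int (*smm_fn)(int sensor);

/*
 * Probe sensors 0..max-1. A sensor is read only if the type query
 * succeeds; otherwise it is ignored. The temperature value itself is
 * stored without any validity check, mirroring the described loop.
 * Returns the number of readings stored in temps[] and ids[].
 */
static int probe_sensors(smm_fn get_type, smm_fn get_temp,
                         int max, int *temps, int *ids)
{
    int n = 0;

    for (int s = 0; s < max; s++) {
        if (get_type(s) < 0)
            continue;           /* type query failed: skip sensor */
        temps[n] = get_temp(s); /* no check for invalid values */
        ids[n] = s;
        n++;
    }
    return n;
}

/* Fake SMM calls for demonstration only: sensor 2 has no type. */
static int fake_get_type(int sensor)
{
    return sensor == 2 ? -1 : 0;
}

static int fake_get_temp(int sensor)
{
    return 40 + sensor;
}
```

With the fake callbacks, probing four sensors stores readings for
sensors 0, 1 and 3 and skips sensor 2.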

Anyway, the DOS binary is quite old (7 years maybe?). It is not
even available for my latest E6440 model. All new Dell laptops now
have an EFI system and the ePSA application (a new version of the
diagnostic tool which reports info about fan, temperature, ...).
That tool looks like it is burned directly into the machine (I can
start it with an empty HDD from the Setup screen) or into the BIOS
image.

And what is interesting about this ePSA:

* it shows more temperature sensors (battery temperature)
* it shows the correct fan RPM and *can* control fan speed

I think the DOS binary has no idea about Optimus or PowerExpress
cards, so for that error handling we need to understand what the
new EFI ePSA application is doing...

And because the function for turning the card on/off is controlled
via ACPI, I bet that the DOS or EFI application does not touch it,
so it assumes that the card is always on and does not need any
error handling.

Another piece of info about the DOS binary: after the SMM code for
reading fan RPM has finished, the function divides the returned RPM
value by some number stored in local data. So now I think that the
magic fan multiplier is not a constant, but a runtime value. I will
try to look at it, to see if we can fix this problem in Linux
i8k.c.

It might be system specific. After all, it is known that old laptops
need a different multiplier.
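
As a minimal sketch of the idea that the scale factor is a runtime
(possibly per-system) value rather than a compile-time constant: the
function scale_fan_rpm and its error convention are hypothetical, and
the numbers in the usage note are made up. Nothing here reflects how
the DOS binary or the SMM code actually obtains the divisor.

```c
/*
 * Scale a raw fan reading by a divisor that is determined at runtime
 * (hypothetically read from system-specific data) instead of being
 * hard-coded. Returns -1 if the raw value or the divisor is unusable.
 */
static int scale_fan_rpm(int raw, int divisor)
{
    if (raw < 0 || divisor <= 0)
        return -1;
    return raw / divisor;
}
```

For example, scale_fan_rpm(90000, 30) gives 3000, while a zero
divisor is rejected with -1 instead of dividing by zero.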

So far we have
0x99 - presumably a spurious error
0xc1 - GPU temperature sensor, GPU turned off

It would be nice if we could find a better solution for error
handling.
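
The two values known so far could be told apart along these lines.
This is a sketch only: the enum, the function name classify_temp and
the suggested reactions (retry, report sensor as off) are assumptions
for illustration, not what i8k.c does, and any other value is simply
treated as a plain reading here.

```c
/* Hypothetical classification of a raw temperature value. */
enum temp_status {
    TEMP_OK,         /* treat the value as a normal reading */
    TEMP_SPURIOUS,   /* 0x99: presumably spurious, caller might retry */
    TEMP_SENSOR_OFF  /* 0xc1: GPU sensor while the GPU is turned off */
};

static enum temp_status classify_temp(int raw)
{
    switch (raw) {
    case 0x99:
        return TEMP_SPURIOUS;
    case 0xc1:
        return TEMP_SENSOR_OFF;
    default:
        return TEMP_OK;
    }
}
```

The obvious weakness is that both values are also plausible as raw
readings, so the caller cannot be sure they are error codes.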


Yes, but for now we can only guess... My idea is that the Dell SMM
handler does not check GPU presence at runtime and just tries to
read info from the PCI bus. And because a turned-off card is not
there, just random (or non-random) garbage is returned...

Well, it was worth a try. It might be that, or the SMM does handle
it but the DOS application is too old to understand it.

Thanks,
Guenter
