Re: k10temp: ZEN3 readings are broken

From: Guenter Roeck
Date: Tue Dec 22 2020 - 10:52:14 EST


On 12/22/20 7:26 AM, Gabriel C wrote:
> On Tue, Dec 22, 2020 at 07:16, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>>
>> On Tue, Dec 22, 2020 at 05:33:17AM +0100, Gabriel C wrote:
>> [ ... ]
>>> At least that is what the weird amd_energy driver added, and since it
>>> only supports fam 17h model 0x31, which is TR 3000 & SP3 Rome, I guess
>>> fam 19h 0x1 is TR/SP3 ZEN3.
>>
>> The limited model support is because people nowadays are not willing to
>> accept that reported values may not always be perfect ... and the reported
>> energy for non-server parts is known to be not always perfect. Kind of an
>> odd situation: If we support non-server parts, we have people complain
>> that values are not perfect. If we only support server parts, we have
>> people complain that only server parts are supported. For us, that is
>> a lose-lose situation. I used to think that it is better to report
>> _something_, but the (sometimes loud) complaints about lack of perfection
>> taught me a lesson. So now my reaction is to drop support if I get
>> complaints about lack of perfection.
>>
>
> I agree it is an odd situation with these modules, but having
> something is better than nothing.

That is your opinion, and it used to be mine as well. As I said, I have
learned from the feedback.

> As for the amd_energy driver, yes, it is off by 2%-5% or so on some
> platforms, but without having that support in the kernel, regardless
> of the module, we can never reach perfection or come near it.
>
> For both the k10temp and amd_energy drivers, I suggest not dropping the
> support but adding kernel module options, disabled by default, much like
> a lot of laptop platform drivers have for different reasons.
>

That would just add complexity for little gain. The code would still have
to be maintained, and as experience (and the out-of-tree driver) has shown,
this is a never-ending story. Plus, it would still be inaccurate, leading
to complaints, module parameter or not.

> The amd_energy driver could have some any_ryzen option which is turned
> off by default. That way people may decide if they want to use it even
> when it is not 100% perfect, and can report back on which platforms the
> reporting is accurate. Waiting for AMD to give us the IDs of what may be
> accurate in their eyes is like waiting for pigs to fly.
>
> The k10temp module is much the same: some experimental_voltage_report
> module option would be fine for now, I think.
>
> I'm also sure owners of AMD HW will help out optimizing and maintaining the code.
>

Not really. My experience is that almost everyone will just complain.
It was a bad idea to add voltage/current reporting to the k10temp driver,
and it is time to revert it. If someone else wants to write (and maintain)
a separate amd_voltage or similar driver, I am open to accepting it.

Note that even you suggested _dropping_ the amd_energy driver instead of
fixing it. I'll take that as QED.

Guenter