Re: [PATCH] hwmon: (peci/dimmtemp) Do not provide fake thresholds data

From: Guenter Roeck
Date: Mon Jan 27 2025 - 12:29:51 EST


On 1/27/25 08:40, Winiarska, Iwona wrote:
On Thu, 2025-01-23 at 15:20 +0300, Paul Fertser wrote:
When an Icelake or Sapphire Rapids CPU isn't providing the maximum and
critical thresholds for a particular DIMM, the driver should return an
error to userspace instead of giving it stale (best case) or wrong
(the structure contains all zeros after the kzalloc() call) data.
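
As an illustration only, the behavior described above can be sketched in plain userspace C. This is not the driver code: `struct dimm_thresholds`, `read_thresholds_from_cpu()` and `get_max_threshold()` are hypothetical names, the PECI read is simulated with a flag, the threshold values are arbitrary, and calloc() stands in for kzalloc().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-DIMM cache; a zeroing allocator leaves both fields at 0. */
struct dimm_thresholds {
	long max_mdeg;  /* maximum threshold, millidegrees Celsius */
	long crit_mdeg; /* critical threshold, millidegrees Celsius */
};

/* Simulated PECI read: fails with -EIO when the CPU gives no answer. */
static int read_thresholds_from_cpu(bool cpu_answers, struct dimm_thresholds *t)
{
	if (!cpu_answers)
		return -EIO;

	t->max_mdeg = 85000;  /* arbitrary example value */
	t->crit_mdeg = 95000; /* arbitrary example value */
	return 0;
}

/*
 * Report the maximum threshold. On a failed read, return -ENODATA and
 * leave *val untouched rather than handing back zeroed or stale data.
 */
static int get_max_threshold(bool cpu_answers, struct dimm_thresholds *t,
			     long *val)
{
	int ret = read_thresholds_from_cpu(cpu_answers, t);

	if (ret < 0)
		return -ENODATA;

	*val = t->max_mdeg;
	return 0;
}
```

With a freshly zeroed structure and a failing read, the caller sees -ENODATA instead of a threshold of 0.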

The issue can be reproduced by binding the peci driver while the host is
fully booted and idle; this makes PECI interaction unreliable enough to
trigger it.

Fixes: 73bc1b885dae ("hwmon: peci: Add dimmtemp driver")
Fixes: 621995b6d795 ("hwmon: (peci/dimmtemp) Add Sapphire Rapids support")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Paul Fertser <fercerpav@xxxxxxxxx>

Hi!

Thank you for the patch.
Did you have a chance to test it with OpenBMC dbus-sensors?
In general, the change looks okay to me, but since it modifies the behavior
(applications will need to handle this, and errors will be returned more
often) we need to confirm that it does not cause any regressions for userspace.


I would also like to understand if the error is temporary or permanent.
If it is permanent, the attributes should not be created in the first
place. It does not make sense to have limit attributes which always report
-ENODATA.
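
For illustration, the "do not create the attributes" alternative can be sketched in userspace C as a visibility decision made once at registration time. The names `enum dimm_attr` and `dimm_attr_mode()` and the `thresholds_available` flag are hypothetical, and the sketch assumes the driver could determine up front that the thresholds are permanently missing, which is exactly the open question here.

```c
#include <stdbool.h>

/* Hypothetical attribute kinds for one DIMM temperature channel. */
enum dimm_attr {
	DIMM_ATTR_INPUT, /* current temperature */
	DIMM_ATTR_MAX,   /* maximum threshold */
	DIMM_ATTR_CRIT   /* critical threshold */
};

/*
 * Return the file mode for an attribute, or 0 to skip creating it.
 * thresholds_available is an assumed flag saying whether the CPU can
 * ever provide thresholds for this DIMM.
 */
static unsigned int dimm_attr_mode(enum dimm_attr attr,
				   bool thresholds_available)
{
	switch (attr) {
	case DIMM_ATTR_MAX:
	case DIMM_ATTR_CRIT:
		/* Hide limit attributes that could only ever fail. */
		return thresholds_available ? 0444 : 0;
	default:
		/* The temperature reading itself is always exposed. */
		return 0444;
	}
}
```

If the failure is only temporary, a check like this does not apply, and returning an error from the read path remains the relevant option.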

Guenter