Re: [PATCH 1/3] hwmon: spd5118: Do not fail resume on temporary I2C errors
From: Guenter Roeck
Date: Fri Jan 16 2026 - 01:24:44 EST
On 1/15/26 05:50, TINSAE TADESSE wrote:
On Wed, Jan 14, 2026 at 5:23 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
On 1/14/26 05:07, TINSAE TADESSE wrote:
...
Please do not drop mailing lists from replies.
Hi Guenter,
I tested changing the i801 SMBus controller to use
SET_LATE_SYSTEM_SLEEP_PM_OPS() instead of
DEFINE_SIMPLE_DEV_PM_OPS() as a diagnostic experiment. With this
change, spd5118 resume failures (-ENXIO)
still persist, suggesting PM ordering alone is insufficient and other
firmware interactions are involved.
How about the problem in the suspend function? Is that also still seen?
Also, the subject talks about -EIO. Is that still seen?
In either case, can you enable debug logs for the i801 driver?
It should generate log entries when it reports errors.
Thanks,
Guenter
Hi Guenter,
Thank you for the questions. To clarify:
1) I have not observed any failures in the suspend path. The suspend
callback completes successfully, and
I have not seen I2C errors or warnings during suspend at any point.
Sorry, I seem to be missing something.
In that case, what is the point of patch 3/3 of your series which
removes hardware accesses from the suspend function?
2) I have also not observed -EIO in my testing. The error consistently
reported on resume and subsequent hwmon access is -ENXIO.
Earlier references to -EIO were based on assumptions rather than
observed logs, and I should have been clearer about that.
Thanks for the clarification.
Guenter
I am enabling debug logging for the i801 driver to collect more
concrete evidence of controller state during resume.
Hi Guenter,
Sorry, I seem to be missing something.
In that case, what is the point of patch 3/3 of your series which
removes hardware accesses from the suspend function?
You are right to question this, and I agree that it needs clarification.
Patch 3/3 was originally proposed under the assumption that the resume failures
were caused by spd5118 performing I2C transactions while the
controller was not yet available,
and that removing hardware accesses from the suspend path might
mitigate the issue.
At that point, I assumed the problem was limited to the resume callback.
After enabling detailed i801 debug logging and testing with
SET_LATE_SYSTEM_SLEEP_PM_OPS() in the i801 driver,
it became clear that this assumption was incorrect. The controller
itself reports "i801_smbus: No response"
both during suspend and immediately after resume, and spd5118 merely
propagates the resulting -ENXIO.
Ouch, that really hurts, because it means that something is seriously
broken in both the suspend and resume path. The device _must_ be accessible
in the suspend path. Otherwise there is no guarantee that the device is
accessible for normal (pre-suspend) operation. After all, someone could
run a script reading sysfs attributes in a tight loop continuously,
or the thermal subsystem could try to access the chip. That would suddenly
start to fail if something in the device access path starts to be suspended
while the underlying hardware is still believed to be operational.
I could imagine some hack/quirk for the resume path, such as delaying resume
for some period of time for affected hardware, but I have no idea what to
do on the suspend side. We cannot just drop device writes during suspend
because some broken hardware/firmware does not let us actually access
(and thus suspend) the hardware anymore by the time the suspend function
is called.
Guenter
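[Editorial illustration, not part of the thread or of any proposed patch.] The idea of "delaying resume for some period of time" can be modeled as a bounded retry around a bus read that returns -ENXIO until the controller responds. The sketch below is plain userspace C, not kernel code. The names fake_bus, fake_bus_read and read_with_retry are hypothetical stand-ins, and the retry count and delay are arbitrary assumptions rather than values taken from the spd5118 or i801 drivers.

```c
#define _POSIX_C_SOURCE 199309L
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a bus whose controller is not ready yet. */
struct fake_bus {
    int failures_left;  /* reads that will still return -ENXIO */
    int value;          /* value returned once the bus responds */
};

/* Simulated read: fails with -ENXIO until failures_left reaches zero. */
static int fake_bus_read(struct fake_bus *bus, int *out)
{
    if (bus->failures_left > 0) {
        bus->failures_left--;
        return -ENXIO;
    }
    *out = bus->value;
    return 0;
}

/*
 * Retry the read up to max_tries times, sleeping delay_ms between
 * attempts. Only -ENXIO is treated as temporary; any other result
 * (success or a different error) is returned immediately.
 */
static int read_with_retry(struct fake_bus *bus, int *out,
                           int max_tries, long delay_ms)
{
    struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int ret = -ENXIO;

    for (int i = 0; i < max_tries; i++) {
        ret = fake_bus_read(bus, out);
        if (ret != -ENXIO)
            return ret;
        if (i + 1 < max_tries)
            nanosleep(&ts, NULL);  /* give the controller time to come back */
    }
    return ret;  /* still -ENXIO after the last attempt */
}
```

As the message above points out, a workaround of this kind could only help on the resume side. It does nothing for a controller that already stops responding by the time the suspend callback runs.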
This indicates that the issue is not caused by spd5118 suspend/resume
behavior, but by the unavailability of the
SMBus controller due to platform or firmware interactions during
s2idle transitions.
Given this, I agree that patch 3/3 does not address the root cause and
does not provide a justified improvement.
I am therefore fine with dropping it.
Thank you for pointing this out.