Re: [PATCH 00/11] reboot: support runtime configuration of emergency hw_protection action

From: Matti Vaittinen
Date: Sun Dec 22 2024 - 04:38:30 EST


On 19/12/2024 09:31, Ahmad Fatoum wrote:
> We currently leave the decision of whether to shutdown or reboot to
> protect hardware in an emergency situation to the individual drivers.
>
> This works out in some cases, where the driver detecting the critical
> failure has inside knowledge: It binds to the system management controller
> for example or is guided by hardware description that defines what to do.
>
> This is inadequate in the general case though as a driver reporting e.g.
> an imminent power failure can't know whether a shutdown or a reboot would
> be more appropriate for a given hardware platform.

Sometimes it can. There are platforms where the hardware is such that we know whether poweroff or reboot is the way to go. In such a case the driver should get the information from the hardware description (like device-tree).

> To address this, this series adds a hw_protection kernel parameter and
> sysfs toggle that can be used to change the action from the shutdown
> default to reboot. A new hw_protection_trigger API then makes use of
> this default action.
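
For illustration only, the configurable default described above could be modelled roughly like this in plain userspace C. The enum constants, the variable and the helper names are made up for this sketch and are not taken from the patches; only the hw_protection setting, its shutdown default and the reboot alternative come from the cover letter. A caller with inside knowledge (e.g. from the hardware description) passes an explicit action, everyone else gets the configured default:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; not the definitions from the series. */
enum hwprot_action {
	HWPROT_DEFAULT,		/* caller has no preference */
	HWPROT_SHUTDOWN,
	HWPROT_REBOOT,
};

/* The cover letter states that shutdown is the default action. */
static enum hwprot_action hwprot_default_action = HWPROT_SHUTDOWN;

/*
 * Models parsing of a "hw_protection=<value>" setting.
 * Returns 0 on success, -1 on an unknown value (default left unchanged).
 */
static int hwprot_set_default(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "shutdown") == 0)
		hwprot_default_action = HWPROT_SHUTDOWN;
	else if (strcmp(value, "reboot") == 0)
		hwprot_default_action = HWPROT_REBOOT;
	else
		return -1;

	return 0;
}

/*
 * Picks the action to take: an explicit request wins; otherwise
 * the configured default applies.
 */
static enum hwprot_action hwprot_resolve(enum hwprot_action requested)
{
	if (requested != HWPROT_DEFAULT)
		return requested;

	return hwprot_default_action;
}
```

In this model, after hwprot_set_default("reboot") a caller passing HWPROT_DEFAULT ends up with HWPROT_REBOOT, while an explicit HWPROT_SHUTDOWN request is left alone.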

> My particular use case is unattended embedded systems that don't
> have support for shutdown and that power on automatically when power is
> supplied:
>
> - A brief power cycle gets detected by the driver
> - The kernel powers down the system and SoC goes into shutdown mode
> - Power is restored
> - The system remains oblivious to the restored power
This sounds like a consequence of the hardware design, as restoring the power doesn't wake up the SoC(?)

> - System needs to be manually power cycled for a duration long enough
>   to drain the capacitors
>
> With this series, such systems can configure the kernel with
> hw_protection=reboot to have the boot firmware worry about critical
> conditions.

I am not against the change though. Just wondering if this is still a consequence of the hardware design, and if the device-tree would be the proper place to indicate that poweroff shouldn't be used.

I'm about to leave my computer behind for the holidays, so I probably won't be able to do a proper review until next year. Hence this quick comment :) Also, I have no strong opinion, so I'm not expecting anyone to hold back waiting for me!

Good luck and happy holidays!
-- Matti

> ---
> Ahmad Fatoum (11):
>       reboot: replace __hw_protection_shutdown bool action parameter with an enum
>       reboot: reboot, not shutdown, on hw_protection_reboot timeout
>       docs: thermal: sync hardware protection doc with code
>       reboot: rename now misleading hw_protection symbols
>       reboot: indicate whether it is a HARDWARE PROTECTION reboot or shutdown
>       reboot: add support for configuring emergency hardware protection action
>       regulator: allow user configuration of hardware protection action
>       platform/chrome: cros_ec_lpc: prepare for hw_protection_shutdown removal
>       dt-bindings: thermal: give OS some leeway in absence of critical-action
>       thermal: core: allow user configuration of hardware protection action
>       reboot: retire hw_protection_reboot and hw_protection_shutdown helpers
>
>  Documentation/ABI/testing/sysfs-kernel-reboot      |   8 ++
>  Documentation/admin-guide/kernel-parameters.txt    |   6 +
>  .../devicetree/bindings/thermal/thermal-zones.yaml |   5 +-
>  Documentation/driver-api/thermal/sysfs-api.rst     |  25 +++--
>  drivers/platform/chrome/cros_ec_lpc.c              |   2 +-
>  drivers/regulator/core.c                           |   4 +-
>  drivers/regulator/irq_helpers.c                    |  16 +--
>  drivers/thermal/thermal_core.c                     |  17 +--
>  drivers/thermal/thermal_core.h                     |   1 +
>  drivers/thermal/thermal_of.c                       |   7 +-
>  include/linux/reboot.h                             |  25 +++--
>  include/uapi/linux/capability.h                    |   1 +
>  kernel/reboot.c                                    | 122 ++++++++++++++++-----
>  13 files changed, 173 insertions(+), 66 deletions(-)
> ---
> base-commit: 78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8
> change-id: 20241218-hw_protection-reboot-96953493726a
>
> Best regards,