Re: [PATCH] accel/amdxdna: Treat power-off failure as unrecoverable error
From: Lizhi Hou
Date: Fri Nov 07 2025 - 11:57:24 EST
Applied to drm-misc-next.
On 11/6/25 10:31, Mario Limonciello wrote:
On 11/6/25 12:19 PM, Lizhi Hou wrote:
On 11/6/25 10:12, Mario Limonciello wrote:
On 11/6/25 12:05 PM, Lizhi Hou wrote:
Failing to set power off indicates an unrecoverable hardware or firmware
error. Update the driver to treat such a failure as a fatal condition
and stop further operations that depend on successful power state
transition.
This prevents undefined behavior when the hardware remains in an
unexpected state after a failed power-off attempt.
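
The control flow described above can be sketched in user-space C. This is only an illustration, not the driver code: `mock_dev`, `mock_smu_exec`, `mock_smu_init` and the `SMU_*` commands are hypothetical stand-ins, and the mock assumes that powering off an already-off device succeeds, as reported for the aie2 platforms in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's SMU commands. */
enum smu_cmd { SMU_POWER_OFF, SMU_POWER_ON };

/* Mock device: fail_off/fail_on inject errors, powered tracks the state. */
struct mock_dev {
	int fail_off;
	int fail_on;
	int powered;
};

/* Mock command executor: returns 0 on success or a negative errno. */
static int mock_smu_exec(struct mock_dev *dev, enum smu_cmd cmd)
{
	assert(cmd == SMU_POWER_OFF || cmd == SMU_POWER_ON);

	if (cmd == SMU_POWER_OFF) {
		if (dev->fail_off)
			return -EIO;
		/* Assumption: power-off on an already-off device is harmless. */
		dev->powered = 0;
		return 0;
	}

	if (dev->fail_on)
		return -EIO;
	dev->powered = 1;
	return 0;
}

/* Init path: force a known power-off state first, then power on. */
static int mock_smu_init(struct mock_dev *dev)
{
	int ret;

	/* A failed power-off is treated as fatal: stop before powering on. */
	ret = mock_smu_exec(dev, SMU_POWER_OFF);
	if (ret) {
		fprintf(stderr, "power off failed, ret %d\n", ret);
		return ret;
	}

	ret = mock_smu_exec(dev, SMU_POWER_ON);
	if (ret) {
		fprintf(stderr, "power on failed, ret %d\n", ret);
		return ret;
	}

	return 0;
}
```

In this sketch a device with `fail_off` set makes `mock_smu_init()` return `-EIO` without ever issuing the power-on command, which is the early-exit behavior the patch aims for.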
Signed-off-by: Lizhi Hou <lizhi.hou@xxxxxxx>
Presumably all versions of hardware in the wild can handle receiving a power off command if they're already powered off?
Yes, for the aie2 platforms. This was verified by the xdna-driver pipeline tests.
OK LGTM then.
Reviewed-by: Mario Limonciello (AMD) <superm1@xxxxxxxxxx>
Lizhi
---
drivers/accel/amdxdna/aie2_smu.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/accel/amdxdna/aie2_smu.c b/drivers/accel/amdxdna/aie2_smu.c
index 11c0e9e7b03a..bd94ee96c2bc 100644
--- a/drivers/accel/amdxdna/aie2_smu.c
+++ b/drivers/accel/amdxdna/aie2_smu.c
@@ -147,6 +147,16 @@ int aie2_smu_init(struct amdxdna_dev_hdl *ndev)
 {
 	int ret;
 
+	/*
+	 * Failing to set power off indicates an unrecoverable hardware or
+	 * firmware error.
+	 */
+	ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
+	if (ret) {
+		XDNA_ERR(ndev->xdna, "Access power failed, ret %d", ret);
+		return ret;
+	}
+
 	ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_ON, 0, NULL);
 	if (ret) {
 		XDNA_ERR(ndev->xdna, "Power on failed, ret %d", ret);