Re: [PATCH V5 09/10] accel/amdxdna: Add error handling

From: Jeffrey Hugo
Date: Fri Oct 25 2024 - 13:49:09 EST


On 10/21/2024 10:19 AM, Lizhi Hou wrote:
When there is a hardware error, the NPU firmware notifies the host through
a mailbox message. The message includes details of the error, such as the
tile and column indexes where the error occurred.

The driver starts a thread to handle the NPU error message. The thread
stops the clients which are using the column where the error occurred. Then
the driver resets that column.
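
The flow described above can be sketched roughly as follows. This is a
userspace illustration only, not the code from the patch: the names
(err_msg, client, npu, handle_npu_error), the column count, and the
bitmask representation of column ownership are all assumptions made for
the example, and the mailbox and handler thread are left out so that only
the stop-then-reset ordering is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COLS 8 /* hypothetical column count, not from the patch */

/* Hypothetical error report: where the firmware says the fault occurred. */
struct err_msg {
	uint32_t tile;
	uint32_t col;
};

/* Hypothetical client state: bit i of col_mask set means column i is in use. */
struct client {
	uint32_t col_mask;
	bool running;
};

struct npu {
	struct client *clients;
	size_t num_clients;
	unsigned int reset_count[NUM_COLS]; /* stands in for a real column reset */
};

/*
 * Stop every running client that uses the faulty column, then reset that
 * column. Returns the number of clients stopped, or -1 if the reported
 * column index is out of range.
 */
static int handle_npu_error(struct npu *npu, const struct err_msg *msg)
{
	uint32_t bit;
	int stopped = 0;
	size_t i;

	assert(npu != NULL && msg != NULL);

	if (msg->col >= NUM_COLS)
		return -1;

	bit = UINT32_C(1) << msg->col;

	for (i = 0; i < npu->num_clients; i++) {
		struct client *c = &npu->clients[i];

		if (c->running && (c->col_mask & bit)) {
			c->running = false;
			stopped++;
		}
	}

	/* Reset only after all affected clients have been stopped. */
	npu->reset_count[msg->col]++;

	return stopped;
}
```

In this sketch, clients whose mask does not include the faulty column are
left running, which matches the per-column scope described in the commit
message.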

Co-developed-by: Min Ma <min.ma@xxxxxxx>
Signed-off-by: Min Ma <min.ma@xxxxxxx>
Signed-off-by: Lizhi Hou <lizhi.hou@xxxxxxx>

Reviewed-by: Jeffrey Hugo <quic_jhugo@xxxxxxxxxxx>