[PATCH 11/16] habanalabs/gaudi: fetch TPC/MME ECC errors from F/W

From: Oded Gabbay
Date: Wed Aug 18 2021 - 09:40:25 EST


From: Ofir Bitton <obitton@xxxxxxxxx>

In case F/W security is enabled driver cannot access ECC registers,
hence driver must fetch the ECC info from F/W.

Signed-off-by: Ofir Bitton <obitton@xxxxxxxxx>
Reviewed-by: Oded Gabbay <ogabbay@xxxxxxxxxx>
Signed-off-by: Oded Gabbay <ogabbay@xxxxxxxxxx>
---
drivers/misc/habanalabs/gaudi/gaudi.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 2f6d019c79e1..6671d1aca8e1 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -7458,6 +7458,11 @@ static void gaudi_handle_ecc_event(struct hl_device *hdev, u16 event_type,
bool extract_info_from_fw;
int rc;

+ if (hdev->asic_prop.fw_security_enabled) {
+ extract_info_from_fw = true;
+ goto extract_ecc_info;
+ }
+
switch (event_type) {
case GAUDI_EVENT_PCIE_CORE_SERR ... GAUDI_EVENT_PCIE_PHY_DERR:
case GAUDI_EVENT_DMA0_SERR_ECC ... GAUDI_EVENT_MMU_DERR:
@@ -7530,6 +7535,7 @@ static void gaudi_handle_ecc_event(struct hl_device *hdev, u16 event_type,
return;
}

+extract_ecc_info:
if (extract_info_from_fw) {
ecc_address = le64_to_cpu(ecc_data->ecc_address);
ecc_syndrom = le64_to_cpu(ecc_data->ecc_syndrom);
--
2.17.1