[PATCH 5.8 328/464] RDMA/hns: Fix the unneeded process when getting a general type of CQE error

From: Greg Kroah-Hartman
Date: Mon Aug 17 2020 - 11:41:24 EST


From: Xi Wang <wangxi11@xxxxxxxxxx>

[ Upstream commit 395f2e8fd340c5bfad026f5968b56ec34cf20dd1 ]

If the hns ROCEE reports a general error CQE (types not specified by the IB
General Specifications), it's no need to change the QP state to error, and
the driver should just skip it.

Fixes: 7c044adca272 ("RDMA/hns: Simplify the cqe code of poll cq")
Link: https://lore.kernel.org/r/1595932941-40613-8-git-send-email-liweihang@xxxxxxxxxx
Signed-off-by: Xi Wang <wangxi11@xxxxxxxxxx>
Signed-off-by: Weihang Li <liweihang@xxxxxxxxxx>
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++++
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 1 +
2 files changed, 10 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 9833ce3e21f9e..eb71b941d21b7 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -3053,6 +3053,7 @@ static void get_cqe_status(struct hns_roce_dev *hr_dev, struct hns_roce_qp *qp,
IB_WC_RETRY_EXC_ERR },
{ HNS_ROCE_CQE_V2_RNR_RETRY_EXC_ERR, IB_WC_RNR_RETRY_EXC_ERR },
{ HNS_ROCE_CQE_V2_REMOTE_ABORT_ERR, IB_WC_REM_ABORT_ERR },
+ { HNS_ROCE_CQE_V2_GENERAL_ERR, IB_WC_GENERAL_ERR}
};

u32 cqe_status = roce_get_field(cqe->byte_4, V2_CQE_BYTE_4_STATUS_M,
@@ -3074,6 +3075,14 @@ static void get_cqe_status(struct hns_roce_dev *hr_dev, struct hns_roce_qp *qp,
print_hex_dump(KERN_ERR, "", DUMP_PREFIX_NONE, 16, 4, cqe,
sizeof(*cqe), false);

+ /*
+ * For hns ROCEE, GENERAL_ERR is an error type that is not defined in
+ * the standard protocol, the driver must ignore it and needn't to set
+ * the QP to an error state.
+ */
+ if (cqe_status == HNS_ROCE_CQE_V2_GENERAL_ERR)
+ return;
+
/*
* Hip08 hardware cannot flush the WQEs in SQ/RQ if the QP state gets
* into errored mode. Hence, as a workaround to this hardware
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index e176b0aaa4ac2..e6c385ced1872 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -230,6 +230,7 @@ enum {
HNS_ROCE_CQE_V2_TRANSPORT_RETRY_EXC_ERR = 0x15,
HNS_ROCE_CQE_V2_RNR_RETRY_EXC_ERR = 0x16,
HNS_ROCE_CQE_V2_REMOTE_ABORT_ERR = 0x22,
+ HNS_ROCE_CQE_V2_GENERAL_ERR = 0x23,

HNS_ROCE_V2_CQE_STATUS_MASK = 0xff,
};
--
2.25.1