[PATCH 5.1 284/405] habanalabs: prevent CPU soft lockup on Palladium

From: Greg Kroah-Hartman
Date: Thu May 30 2019 - 00:48:57 EST


[ Upstream commit e850b89f50d2c1439f58d547b888ee6e43312dea ]

Unmapping ptes in the device MMU on Palladium can take a long time, which
can cause a kernel BUG of CPU soft lockup.

This patch minimize the chances for this bug by sleeping a little between
unmapping ptes.

Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/misc/habanalabs/memory.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/memory.c b/drivers/misc/habanalabs/memory.c
index ce1fda40a8b81..fadaf557603f5 100644
--- a/drivers/misc/habanalabs/memory.c
+++ b/drivers/misc/habanalabs/memory.c
@@ -1046,10 +1046,17 @@ static int unmap_device_va(struct hl_ctx *ctx, u64 vaddr)

mutex_lock(&ctx->mmu_lock);

- for (i = 0 ; i < phys_pg_pack->npages ; i++, next_vaddr += page_size)
+ for (i = 0 ; i < phys_pg_pack->npages ; i++, next_vaddr += page_size) {
if (hl_mmu_unmap(ctx, next_vaddr, page_size))
dev_warn_ratelimited(hdev->dev,
- "unmap failed for vaddr: 0x%llx\n", next_vaddr);
+ "unmap failed for vaddr: 0x%llx\n", next_vaddr);
+
+ /* unmapping on Palladium can be really long, so avoid a CPU
+ * soft lockup bug by sleeping a little between unmapping pages
+ */
+ if (hdev->pldm)
+ usleep_range(500, 1000);
+ }

hdev->asic_funcs->mmu_invalidate_cache(hdev, true);

--
2.20.1