Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

From: Christian König
Date: Fri May 17 2024 - 02:35:22 EST


On 16.05.24 at 19:57, Tim Van Patten wrote:
From: Tim Van Patten <timvp@xxxxxxxxxx>

The following commit updated gmc->noretry from 0 to 1 for GC HW IP
9.3.0:

commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")

This causes the device to hang when a page fault occurs, and it stays
hung until it is rebooted. Revert to gmc->noretry=0 so that the device
remains responsive after a page fault.

Wait a second. Why does the device hang on a page fault? That shouldn't happen regardless of the noretry setting.

So this strongly sounds like the patch is just hiding a bug elsewhere.

Regards,
Christian.


Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
Signed-off-by: Tim Van Patten <timvp@xxxxxxxxxx>
---

drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index be4629cdac049..bff54a20835f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -876,7 +876,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
struct amdgpu_gmc *gmc = &adev->gmc;
uint32_t gc_ver = amdgpu_ip_version(adev, GC_HWIP, 0);
bool noretry_default = (gc_ver == IP_VERSION(9, 0, 1) ||
- gc_ver == IP_VERSION(9, 3, 0) ||
gc_ver == IP_VERSION(9, 4, 0) ||
gc_ver == IP_VERSION(9, 4, 1) ||
gc_ver == IP_VERSION(9, 4, 2) ||