[PATCH 4.4 187/342] drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)

From: Greg Kroah-Hartman
Date: Tue Mar 01 2016 - 19:51:46 EST


4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicolai HÃhnle <nicolai.haehnle@xxxxxxx>

commit a8d81b36267366603771431747438d18f32ae2d5 upstream.

An arbitrary amount of time can pass between spin_unlock and
fence_wait_any_timeout, so we need to ensure that nobody frees the
fences from under us.

A stress test (rapidly starting and killing hundreds of glxgears
instances) ran into a deadlock in fence_wait_any_timeout after
about an hour, and this race condition appears to be a plausible
cause.

v2: agd: rebase on upstream

Signed-off-by: Nicolai HÃhnle <nicolai.haehnle@xxxxxxx>
Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx>
Reviewed-by: Christian KÃnig <christian.koenig@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@@ -354,12 +354,15 @@ int amdgpu_sa_bo_new(struct amdgpu_sa_ma

for (i = 0, count = 0; i < AMDGPU_MAX_RINGS; ++i)
if (fences[i])
- fences[count++] = fences[i];
+ fences[count++] = fence_get(fences[i]);

if (count) {
spin_unlock(&sa_manager->wq.lock);
t = fence_wait_any_timeout(fences, count, false,
MAX_SCHEDULE_TIMEOUT);
+ for (i = 0; i < count; ++i)
+ fence_put(fences[i]);
+
r = (t > 0) ? 0 : t;
spin_lock(&sa_manager->wq.lock);
} else {