[PATCH 4.18 188/228] amdgpu: fix multi-process hang issue

From: Greg Kroah-Hartman
Date: Tue Oct 02 2018 - 09:30:35 EST


4.18-stable review patch. If anyone has any objections, please let me know.

------------------

From: Emily Deng <Emily.Deng@xxxxxxx>

[ Upstream commit 2f40c6eac74a2a60921cdec9e9a8a57e88e31434 ]

SWDEV-146499: hang during multi vulkan process testing

cause:
the second frame's PREAMBLE_IB have clear-state
and LOAD actions, those actions ruin the pipeline
that is still doing process in the previous frame's
work-load IB.

fix:
need insert pipeline sync if have context switch for
SRIOV (because only SRIOV will report PREEMPTION flag
to UMD)

Signed-off-by: Monk Liu <Monk.Liu@xxxxxxx>
Signed-off-by: Emily Deng <Emily.Deng@xxxxxxx>
Reviewed-by: Christian KÃnig <christian.koenig@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -164,8 +164,10 @@ int amdgpu_ib_schedule(struct amdgpu_rin
return r;
}

+ need_ctx_switch = ring->current_ctx != fence_ctx;
if (ring->funcs->emit_pipeline_sync && job &&
((tmp = amdgpu_sync_get_fence(&job->sched_sync, NULL)) ||
+ (amdgpu_sriov_vf(adev) && need_ctx_switch) ||
amdgpu_vm_need_pipeline_sync(ring, job))) {
need_pipe_sync = true;
dma_fence_put(tmp);
@@ -196,7 +198,6 @@ int amdgpu_ib_schedule(struct amdgpu_rin
}

skip_preamble = ring->current_ctx == fence_ctx;
- need_ctx_switch = ring->current_ctx != fence_ctx;
if (job && ring->funcs->emit_cntxcntl) {
if (need_ctx_switch)
status |= AMDGPU_HAVE_CTX_SWITCH;