amdgpu hang on picasso

From: Ken Moffat
Date: Sat Oct 30 2021 - 15:32:04 EST


When I tried 5.15-rc7 on my picasso APU (Ryzen 5 3400G) and ran
'startx' (I use X11 and log in on a tty), the output messages from
X11 stopped after a few lines (normally the desktop appears before I
can read anything) and the keyboard and mouse were unresponsive. I
had to use Magic SysRq to sync and reboot.

The log showed:
Oct 28 03:02:21 deluxe klogd: [ 31.347235] amdgpu 0000:09:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Oct 28 03:02:34 deluxe klogd: [ 44.280185] amdgpu 0000:09:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
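
For anyone checking their own logs for the same symptom, here is a
minimal sketch in Python that picks these messages out of a log. The
field layout in the pattern is assumed from the two lines above only,
and the names FAILED_WRITE and find_failed_writes are made up for
illustration; they are not part of any amdgpu tooling.

```python
import re

# Pattern for the amdgpu register-write failures quoted in this report.
# Assumption: the layout (bracketed timestamp, PCI address, two hex
# register offsets) is inferred from the two sample lines only.
FAILED_WRITE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+amdgpu\s+(?P<pci>[0-9a-f:.]+):\s+"
    r"amdgpu:\s+failed to write reg\s+(?P<reg>[0-9a-f]+)\s+"
    r"wait reg\s+(?P<wait_reg>[0-9a-f]+)"
)


def find_failed_writes(lines):
    """Return one dict per log line that reports a failed register write."""
    hits = []
    for line in lines:
        match = FAILED_WRITE.search(line)
        if match:
            hits.append(
                {
                    "time": float(match.group("time")),      # seconds since boot
                    "pci": match.group("pci"),               # PCI address of the GPU
                    "reg": int(match.group("reg"), 16),      # register offset (hex)
                    "wait_reg": int(match.group("wait_reg"), 16),
                }
            )
    return hits


if __name__ == "__main__":
    sample = [
        "Oct 28 03:02:21 deluxe klogd: [ 31.347235] amdgpu 0000:09:00.0: "
        "amdgpu: failed to write reg 28b4 wait reg 28c6",
        "Oct 28 03:02:34 deluxe klogd: [ 44.280185] amdgpu 0000:09:00.0: "
        "amdgpu: failed to write reg 1a6f4 wait reg 1a706",
    ]
    for hit in find_failed_writes(sample):
        print(hit["time"], hit["pci"], hex(hit["reg"]), hex(hit["wait_reg"]))
```

Any hit after starting X on an affected kernel would suggest the same
failure; an empty result does not prove the kernel is good.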

I started bisecting after confirming that Linus' tree with head at
f25a5481af12 still showed the problem. That identified the
following commit, which reverts cleanly and allows Xorg to start:

commit 714d9e4574d54596973ee3b0624ee4a16264d700
Author: Yifan Zhang <yifan1.zhang@xxxxxxx>
Date: Tue Sep 28 15:42:35 2021 +0800

drm/amdgpu: init iommu after amdkfd device init

This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.2 AMD-APP (3364.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@xxxxxxx>
Reviewed-by: James Zhu <James.Zhu@xxxxxxx>
Tested-by: James Zhu <James.Zhu@xxxxxxx>
Acked-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

I've also got a laptop with raven. I'll try to find time in the next
few days to test whether it shows the problem too.

ĸen
--
A capitalist society is one where individuals own and acquire
property, at least for a few months until cooler objects come out.
-- Late Night Mash