RE: Linux 6.1-rc1 drm/amdgpu regression

From: Deucher, Alexander
Date: Wed Oct 19 2022 - 17:24:21 EST


[Public]

> -----Original Message-----
> From: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, October 19, 2022 5:00 PM
> To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> Subject: Re: Linux 6.1-rc1 drm/amdgpu regression
>
> On 10/19/22 14:27, Deucher, Alexander wrote:
> > [AMD Official Use Only - General]
> >
> >> -----Original Message-----
> >> From: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> >> Sent: Wednesday, October 19, 2022 4:00 PM
> >> To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
> >> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; Shuah Khan
> >> <skhan@xxxxxxxxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> >> Subject: Linux 6.1-rc1 drm/amdgpu regression
> >>
> >> Hi Alex,
> >>
> >> On Linux 6.1-rc1 I am seeing the same problem I sent reverts for on
> >> 5.10.147. This is on my laptop with an AMD Ryzen 7 PRO 5850U with
> >> Radeon Graphics.
> >>
> >> commit e3163bc8ffdfdb405e10530b140135b2ee487f89
> >> Author: Alex Deucher <alexander.deucher@xxxxxxx>
> >> Date: Fri Sep 9 11:53:27 2022 -0400
> >>
> >> drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
> >>
> >> I see that the following has been reverted in Linux 6.1-rc1
> >>
> >> commit 66f99628eb24409cb8feb5061f78283c8b65f820
> >> Author: Hamza Mahfooz <hamza.mahfooz@xxxxxxx>
> >> Date: Tue Sep 6 15:01:49 2022 -0400
> >>
> >> drm/amdgpu: use dirty framebuffer helper
> >>
> >> However, I still see the following filling dmesg, and the system is unusable.
> >> For now I switched back to Linux 6.0 as this is my primary system.
> >>
> >> [drm] Fence fallback timer expired on ring sdma0
> >> [drm] Fence fallback timer expired on ring gfx
> >> [drm] Fence fallback timer expired on ring sdma0
> >> [drm] Fence fallback timer expired on ring gfx
> >> [drm] Fence fallback timer expired on ring sdma0
> >> [drm] Fence fallback timer expired on ring sdma0
> >> [drm] Fence fallback timer expired on ring sdma0
> >> [drm] Fence fallback timer expired on ring gfx
> >>
> >> Please let me know if I should send a revert for this for mainline as well.
> >>
> >
> > Can you file a bug report (https://gitlab.freedesktop.org/drm/amd/-/issues)
> > and attach your dmesg output? I'd like to try and repro the issue if I can
> > and provide some patches to test. I'd like to avoid reverting the patch as
> > that will break the driver for users using vega dGPUs.
>
> Makes sense. I will file the bug and attach dmesg. Since this is my primary
> system, there will be some delay in getting this info to you and in testing
> any patches you provide.
>

Actually I think I see what's wrong. Can you try the attached patch?

Alex

Attachment: 0001-drm-amdgpu-fix-sdma-doorbell-init-ordering-on-APUs.patch