Re: Linux 6.18.19 -- amdgpu bug and a new warning

From: Harshit Mogalapalli

Date: Thu Mar 26 2026 - 02:51:32 EST


Hi,


A commit in 6.18.19 has introduced a bug and a new warning when doing
amdgpu driver re-binding. In addition to the bug, the last line of the
output below is a new warning re: the thermal alert.
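
For anyone trying to reproduce this, re-binding here means unbinding the device from the driver and binding it again through the PCI sysfs interface. The sketch below is a dry run that only prints the two writes. The function name rebind_cmds is hypothetical, and the actual contents of bind-device.sh are not shown in this thread, so this is an assumption about what such a script does, not a copy of it.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the sysfs writes that would unbind and
# then re-bind a PCI device. It does not touch the system. Running the
# printed commands for real requires root.
rebind_cmds() {
    dev="$1"              # PCI address, e.g. 0000:14:00.0 from the log
    drv="${2:-amdgpu}"    # driver name, defaults to amdgpu
    printf 'echo %s > /sys/bus/pci/drivers/%s/unbind\n' "$dev" "$drv"
    printf 'echo %s > /sys/bus/pci/drivers/%s/bind\n' "$dev" "$drv"
}

rebind_cmds 0000:14:00.0
```

The warning in the trace fires on the teardown path (smu_hw_fini), so the unbind step alone should be enough to trigger it.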

This bug doesn't seem to cause any show-stopping problems, but it is a bug
and it persists into 6.18.20.

I can do a bisect if needed, but I'm hoping one of our AMD guys can more
quickly spot what's going on :)

Are you saying it was introduced between 6.18.18 and 6.18.19? Nothing immediately jumps out at me, so I would say bisect, please.


I think backporting this would help?

commit: e12603bf2c3d ("drm/amd/pm: fix amdgpu_irq enabled counter unbalanced on smu v11.0")



   amdgpu 0000:14:00.0: amdgpu: amdgpu: finishing device.
   ------------[ cut here ]------------
   WARNING: CPU: 1 PID: 2773 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:639 amdgpu_irq_put+0xa4/0xc0 [amdgpu]
...
   CPU: 1 UID: 0 PID: 2773 Comm: bind-device.sh Not tainted 6.18.20 #1 PREEMPT(lazy)
..
    <TASK>
    smu_smc_hw_cleanup+0x61/0x490 [amdgpu]
    smu_hw_fini+0xef/0x180 [amdgpu]
    amdgpu_ip_block_hw_fini+0x37/0x41 [amdgpu]

Thanks,
Harshit