Re: Failed to create a rescuer kthread for the amdgpu-reset-dev workqueue

From: Christian König
Date: Mon Jan 15 2024 - 05:20:43 EST


On 15.01.24 at 11:17, Thomas Perrot wrote:
Hello Christian,

On Fri, 2024-01-12 at 09:17 +0100, Christian König wrote:
Well, the driver load is interrupted for some reason.

Have you set any timeout for modprobe?

We don't set a modprobe timeout.

Well, you are somehow aborting the driver probe.

This seems to be an external event and not something the driver can influence.

Regards,
Christian.


Kind regards,
Thomas

Regards,
Christian.

On 12.01.24 at 09:11, Thomas Perrot wrote:
Hello,

We are updating the kernel from 6.1 to 6.6 and we observe an amdgpu
regression with a Radeon RX580 8GB and a SiFive Unmatched:
“workqueue: Failed to create a rescuer kthread for wq 'amdgpu-reset-dev': -EINTR
[drm:amdgpu_reset_create_reset_domain [amdgpu]] *ERROR* Failed to allocate wq for amdgpu_reset_domain!
amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
amdgpu: probe of 0000:07:00.0 failed with error -12”

We have tried to figure it out, so far without success. Do you have
any advice on how to identify the root cause and fix it?

Kind regards,
Thomas Perrot