Re: AMDGPU: regression on 5.17.1
From: Alex Deucher
Date: Mon Apr 11 2022 - 14:34:54 EST
On Sat, Apr 9, 2022 at 12:28 PM Michele Ballabio <ballabio.m@xxxxxxxxx> wrote:
>
> On Tue, 5 Apr 2022 10:23:16 -0400
> Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>
> > On Mon, Apr 4, 2022 at 3:39 PM Michele Ballabio
> > <ballabio.m@xxxxxxxxx> wrote:
> > >
> > > On Mon, 4 Apr 2022 13:03:41 -0400
> > > Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > >
> > > > On Sun, Apr 3, 2022 at 10:19 AM Michele Ballabio
> > > > <ballabio.m@xxxxxxxxx> wrote:
> > > > >
> > > > > Hi,
> > > > > I've hit a regression on 5.17.1 (haven't tested 5.17.0,
> > > > > but 5.16-stable didn't have this problem).
> > > > >
> > > > > The machine is a Ryzen 5 1600 with AMD graphics (RX 560).
> > > > >
> > > > > The regression I hit seems to trigger when the machine is left
> > > > > idle at boot (I don't boot straight to X, I boot to a tty, login
> > > > > and then start X). The machine after a while blanks the screen.
> > > > > Usually, the screen unblanks as the keyboard is hit or the mouse
> > > > > moves, but with kernel 5.17.1 the screen does not wake up. The
> > > > > machine seems to run mostly fine: I can login from ssh, but I
> > > > > cannot reboot or halt it: a sysrq sequence is needed for that.
> > > > > Note that if the screen goes blank under X, it wakes up fine.
> > > > >
> > > > > Below a dmesg and two traces from syslog (they're quite
> > > > > similar).
> > > >
> > > > Can you bisect? Does setting amdgpu.runpm=0 help?
> > >
> > > I can try to bisect, should I narrow the search to drivers/gpu/drm/
> > > ?
> >
> > I would just do a full bisect if possible in case the change happens
> > to be outside of drm.
> >
> > >
> > > Setting amdgpu.runpm=0 works, the display now unblanks without
> > > problems.
> >
>
> Hi,
> I bisected this, and the first bad commit is
> [087451f372bf76d971184caa258807b7c35aac8f] drm/amdgpu: use generic fb
> helpers instead of setting up AMD own's.
>
> Let me know if you need some more testing.
Thanks. Do the attached patches fix the issue?
Thanks,
Alex
From be5c2680ed4b3b4121af4f692c905d88d7b6cc00 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Tue, 28 Dec 2021 17:26:24 -0500
Subject: [PATCH 1/2] drm/amdgpu: don't runtime suspend if there are displays
attached
We normally runtime suspend when there are displays attached if they
are in the DPMS off state, however, if something wakes the GPU
we send a hotplug event on resume (in case any displays were connected
while the GPU was in suspend) which can cause userspace to light
up the displays again soon after they were turned off.
Prior to
commit 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."),
the driver took a runtime pm reference when the fbdev emulation was
enabled because we didn't implement proper shadowing support for
vram access when the device was off so the device never runtime
suspended when there was a console bound. Once that commit landed,
we now utilize the core fb helper implementation which properly
handles the emulation, so runtime pm now suspends in cases where it did
not before. Ultimately, we need to sort out why runtime suspend in not
working in this case for some users, but this should restore similar
behavior to before.
Fixes: 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4efaa183abcd..cf795a9b8aef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2508,6 +2508,8 @@ static int amdgpu_pmops_runtime_idle(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
+ struct drm_connector *list_connector;
+ struct drm_connector_list_iter iter;
/* we don't want the main rpm_idle to call suspend - we want to autosuspend */
int ret = 1;
@@ -2516,6 +2518,22 @@ static int amdgpu_pmops_runtime_idle(struct device *dev)
return -EBUSY;
}
+ /* XXX: Return busy if any displays are connected to avoid
+ * possible display wake ups after runtime resume due to
+ * hotplug events in case any displays were connected while
+ * the GPU was in suspend. Remove this once that is fixed.
+ */
+ mutex_lock(&drm_dev->mode_config.mutex);
+ drm_connector_list_iter_begin(drm_dev, &iter);
+ drm_for_each_connector_iter(list_connector, &iter) {
+ if (list_connector->status == connector_status_connected) {
+ ret = -EBUSY;
+ break;
+ }
+ }
+ drm_connector_list_iter_end(&iter);
+ mutex_unlock(&drm_dev->mode_config.mutex);
+
if (amdgpu_device_has_dc_support(adev)) {
struct drm_crtc *crtc;
@@ -2527,11 +2545,7 @@ static int amdgpu_pmops_runtime_idle(struct device *dev)
if (ret < 0)
break;
}
-
} else {
- struct drm_connector *list_connector;
- struct drm_connector_list_iter iter;
-
mutex_lock(&drm_dev->mode_config.mutex);
drm_modeset_lock(&drm_dev->mode_config.connection_mutex, NULL);
--
2.35.1
From 498311adc8ce0746c0c8710961e8f6dbc9774262 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Tue, 28 Dec 2021 17:32:55 -0500
Subject: [PATCH 2/2] drm/amdgpu: only check display if the GPU has them in
runpm idle
In the runtime pm idle callback, only check display conditions
if the GPU has display hardware. This saves some code
execution on headless GPUs.
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 76 +++++++++++++------------
1 file changed, 39 insertions(+), 37 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index cf795a9b8aef..eac0a35c0cc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2518,53 +2518,55 @@ static int amdgpu_pmops_runtime_idle(struct device *dev)
return -EBUSY;
}
- /* XXX: Return busy if any displays are connected to avoid
- * possible display wake ups after runtime resume due to
- * hotplug events in case any displays were connected while
- * the GPU was in suspend. Remove this once that is fixed.
- */
- mutex_lock(&drm_dev->mode_config.mutex);
- drm_connector_list_iter_begin(drm_dev, &iter);
- drm_for_each_connector_iter(list_connector, &iter) {
- if (list_connector->status == connector_status_connected) {
- ret = -EBUSY;
- break;
- }
- }
- drm_connector_list_iter_end(&iter);
- mutex_unlock(&drm_dev->mode_config.mutex);
-
- if (amdgpu_device_has_dc_support(adev)) {
- struct drm_crtc *crtc;
-
- drm_for_each_crtc(crtc, drm_dev) {
- drm_modeset_lock(&crtc->mutex, NULL);
- if (crtc->state->active)
- ret = -EBUSY;
- drm_modeset_unlock(&crtc->mutex);
- if (ret < 0)
- break;
- }
- } else {
+ if (adev->mode_info.num_crtc) {
+ /* XXX: Return busy if any displays are connected to avoid
+ * possible display wake ups after runtime resume due to
+ * hotplug events in case any displays were connected while
+ * the GPU was in suspend. Remove this once that is fixed.
+ */
mutex_lock(&drm_dev->mode_config.mutex);
- drm_modeset_lock(&drm_dev->mode_config.connection_mutex, NULL);
-
drm_connector_list_iter_begin(drm_dev, &iter);
drm_for_each_connector_iter(list_connector, &iter) {
- if (list_connector->dpms == DRM_MODE_DPMS_ON) {
+ if (list_connector->status == connector_status_connected) {
ret = -EBUSY;
break;
}
}
-
drm_connector_list_iter_end(&iter);
-
- drm_modeset_unlock(&drm_dev->mode_config.connection_mutex);
mutex_unlock(&drm_dev->mode_config.mutex);
- }
- if (ret == -EBUSY)
- DRM_DEBUG_DRIVER("failing to power off - crtc active\n");
+ if (amdgpu_device_has_dc_support(adev)) {
+ struct drm_crtc *crtc;
+
+ drm_for_each_crtc(crtc, drm_dev) {
+ drm_modeset_lock(&crtc->mutex, NULL);
+ if (crtc->state->active)
+ ret = -EBUSY;
+ drm_modeset_unlock(&crtc->mutex);
+ if (ret < 0)
+ break;
+ }
+ } else {
+ mutex_lock(&drm_dev->mode_config.mutex);
+ drm_modeset_lock(&drm_dev->mode_config.connection_mutex, NULL);
+
+ drm_connector_list_iter_begin(drm_dev, &iter);
+ drm_for_each_connector_iter(list_connector, &iter) {
+ if (list_connector->dpms == DRM_MODE_DPMS_ON) {
+ ret = -EBUSY;
+ break;
+ }
+ }
+
+ drm_connector_list_iter_end(&iter);
+
+ drm_modeset_unlock(&drm_dev->mode_config.connection_mutex);
+ mutex_unlock(&drm_dev->mode_config.mutex);
+ }
+
+ if (ret == -EBUSY)
+ DRM_DEBUG_DRIVER("failing to power off - crtc active\n");
+ }
pm_runtime_mark_last_busy(dev);
pm_runtime_autosuspend(dev);
--
2.35.1