Re: Regression: drm/radeon: brightness control hard system lockup
From: Alex Deucher
Date: Mon Jan 14 2013 - 11:07:55 EST
On Sun, Jan 13, 2013 at 4:00 PM, Eldad Zack <eldad@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 7 Jan 2013, Alex Deucher wrote:
>> On Mon, Jan 7, 2013 at 4:33 PM, Eldad Zack <eldad@xxxxxxxxxxxxxxx> wrote:
>> >
>> > On Mon, 7 Jan 2013, Alex Deucher wrote:
>> >> On Sun, Jan 6, 2013 at 7:59 AM, Eldad Zack <eldad@xxxxxxxxxxxxxxx> wrote:
>> >> >
>> >> > Hi Alex,
>> >> >
>> >> > Commit 0ecebb9e0d14e9948e0b1529883a776758117d6f "drm/radeon: switch to a
>> >> > finer grained reset for evergreen" introduced a hard system lockup to my
>> >> > setup. I found it after bisecting, and confirmed it by reverting it on
>> >> > the latest mainline ( 5f243b9 ).
>> >> >
>> >> > This:
>> >> >
>> >> > echo 7 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/backlight/acpi_video0/brightness
>> >> >
>> >> > Causes a hard lock-up hard, i.e. immediate freeze, without any logs.
>> >> >
>> >> > See lspci output and kernel .config below.
>> >> > If there's any more info I can provide, please let me know.
>> >>
>> >> Do you normally see GPU resets when changing the backlight? Please
>> >> attach your dmesg output when changing the backlight with the patch
>> >> reverted.
>> >
>> > I see nothing. Just to make sure, I cleared the buffer, cycled through
>> > 0-7 a couple of hunderd times (until the flicker annoyed), but I see no
>> > messages at all.
>> > Is there any debug config I should turn on?
>>
>> Can you try adding a printk() in evergreen_asic_reset() and see if it
>> is somehow getting called when you change the brightness? When you
>> use the apci backlight control, the radeon driver is not involved at
>> all. They only way the driver would get involved is if the acpi
>> backlight control somehow caused the GPU to hang and then the driver
>> detected the hang and attempted to reset the GPU. I don't see any
>> evidence of a GPU reset in your kernel log however. Note that the
>> driver also registers native backlight contol. Does that work any
>> better than acpi?
>
> The native backlight controls work very well. Thanks for that, I didn't
> even noticed that. It has finer control over brightness too.
>
> I worked out a fix for the problem, but I think it's not a proper one.
> What I noticed is that evergreen_gpu_soft_reset() is only ever called
> once on my system, at boot.
> Then I realized from the dmesg that neither
> evergreen_gpu_soft_reset_dma() nor evergreen_gpu_soft_reset_gfx() actually
> do anything. Both return on the first if statements there.
>
> So as far as I can tell, the difference your patch introduced is calling
> evergreen_mc_stop() and evergreen_mc_resume(), which somehow puts my
> system in some state that ACPI brightness control leads to a lock up.
>
> BTW, I don't see a GPU reset happening at all - I also tried suspend/resume
> and starting an OpenGL application. Do you know how I can trigger it?
>
> To fix this, I moved the GUI_ACTIVE test before evergreen_mc_stop(), but
> I think it just masks the issue.
> Patch below (against latest HEAD) just in case.
>
Thanks for tracking this down. The attached patch should fix the
issue properly.
Alex
> Cheers,
> Eldad
>
> From 5817128d2761f60051b069d2bb31209c909b6a04 Mon Sep 17 00:00:00 2001
> From: Eldad Zack <eldad@xxxxxxxxxxxxxxx>
> Date: Sun, 13 Jan 2013 21:14:08 +0100
> Subject: [PATCH] drm/radeon: fix evergreen brightness control regression
>
> Commit 0ecebb9e0d14e9948e0b1529883a776758117d6f introduced a
> regression, where using ACPI brightness control leads to a
> hard system lock-up.
> To resolve the issue, this patch moves the GUI_ACTIVE test
> earlier, as it was before the commit.
>
> Signed-off-by: Eldad Zack <eldad@xxxxxxxxxxxxxxx>
> ---
> drivers/gpu/drm/radeon/evergreen.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
> index 061fa0a..1392f99 100644
> --- a/drivers/gpu/drm/radeon/evergreen.c
> +++ b/drivers/gpu/drm/radeon/evergreen.c
> @@ -2310,9 +2310,6 @@ static void evergreen_gpu_soft_reset_gfx(struct radeon_device *rdev)
> {
> u32 grbm_reset = 0;
>
> - if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
> - return;
> -
> dev_info(rdev->dev, " GRBM_STATUS = 0x%08X\n",
> RREG32(GRBM_STATUS));
> dev_info(rdev->dev, " GRBM_STATUS_SE0 = 0x%08X\n",
> @@ -2404,6 +2401,9 @@ static int evergreen_gpu_soft_reset(struct radeon_device *rdev, u32 reset_mask)
> if (reset_mask == 0)
> return 0;
>
> + if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
> + return 0;
> +
> dev_info(rdev->dev, "GPU softreset: 0x%08X\n", reset_mask);
>
> evergreen_mc_stop(rdev, &save);
> --
> 1.7.12.4
>
From cb5206aa1c6c52b1b459eaf93797c593cff0c8db Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Mon, 14 Jan 2013 11:04:39 -0500
Subject: [PATCH] drm/radeon: clear reset flags if engines are idle
Fixes a regression in the gpu reset code after the
rework for DMA support.
Reported-by: Eldad Zack <eldad@xxxxxxxxxxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/radeon/evergreen.c | 6 ++++++
drivers/gpu/drm/radeon/ni.c | 6 ++++++
drivers/gpu/drm/radeon/r600.c | 6 ++++++
drivers/gpu/drm/radeon/si.c | 6 ++++++
4 files changed, 24 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 061fa0a..4d0e60a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2401,6 +2401,12 @@ static int evergreen_gpu_soft_reset(struct radeon_device *rdev, u32 reset_mask)
{
struct evergreen_mc_save save;
+ if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
+ reset_mask &= ~(RADEON_RESET_GFX | RADEON_RESET_COMPUTE);
+
+ if (RREG32(DMA_STATUS_REG) & DMA_IDLE)
+ reset_mask &= ~RADEON_RESET_DMA;
+
if (reset_mask == 0)
return 0;
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 896f1cb..59acabb 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1409,6 +1409,12 @@ static int cayman_gpu_soft_reset(struct radeon_device *rdev, u32 reset_mask)
{
struct evergreen_mc_save save;
+ if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
+ reset_mask &= ~(RADEON_RESET_GFX | RADEON_RESET_COMPUTE);
+
+ if (RREG32(DMA_STATUS_REG) & DMA_IDLE)
+ reset_mask &= ~RADEON_RESET_DMA;
+
if (reset_mask == 0)
return 0;
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 537e259..3cb9d60 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -1378,6 +1378,12 @@ static int r600_gpu_soft_reset(struct radeon_device *rdev, u32 reset_mask)
{
struct rv515_mc_save save;
+ if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
+ reset_mask &= ~(RADEON_RESET_GFX | RADEON_RESET_COMPUTE);
+
+ if (RREG32(DMA_STATUS_REG) & DMA_IDLE)
+ reset_mask &= ~RADEON_RESET_DMA;
+
if (reset_mask == 0)
return 0;
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 3240a3d..ae8b482 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -2215,6 +2215,12 @@ static int si_gpu_soft_reset(struct radeon_device *rdev, u32 reset_mask)
{
struct evergreen_mc_save save;
+ if (!(RREG32(GRBM_STATUS) & GUI_ACTIVE))
+ reset_mask &= ~(RADEON_RESET_GFX | RADEON_RESET_COMPUTE);
+
+ if (RREG32(DMA_STATUS_REG) & DMA_IDLE)
+ reset_mask &= ~RADEON_RESET_DMA;
+
if (reset_mask == 0)
return 0;
--
1.7.7.5