RE: [PATCH 5.10 637/717] drm/amd/display: Fix memory leaks in S3 resume

From: Deucher, Alexander
Date: Tue Jan 05 2021 - 11:42:27 EST


[AMD Public Use]

> -----Original Message-----
> From: Andre Tomt <andre@xxxxxxxx>
> Sent: Tuesday, January 5, 2021 11:32 AM
> To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Wentland, Harry
> <Harry.Wentland@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Kazlauskas, Nicholas
> <Nicholas.Kazlauskas@xxxxxxx>; Wang, Chao-kai (Stylon)
> <Stylon.Wang@xxxxxxx>
> Subject: Re: [PATCH 5.10 637/717] drm/amd/display: Fix memory leaks in S3
> resume
>
> On 05.01.2021 07:54, Greg Kroah-Hartman wrote:
> > On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote:
> >> On 28.12.2020 13:50, Greg Kroah-Hartman wrote:
> >>> From: Stylon Wang <stylon.wang@xxxxxxx>
> >>>
> >>> commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream.
> >>>
> >>> EDID parsing in S3 resume pushes new display modes to probed_modes
> >>> list but doesn't consolidate to actual mode list. This creates a
> >>> race condition when
> >>> amdgpu_dm_connector_ddc_get_modes() re-initializes the list head
> >>> without walking the list and results in memory leak.
> >>
> >> This commit is causing me problems on 5.10.4: when I turn off the
> >> display (a LG TV in this case), and turn it back on again later there
> >> is no video output and I get the following in the kernel log:
> >>
> >> [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]]
> *ERROR*
> >> Restoring old state failed with -12
> >>
> >> I've found another report on this commit as well:
> >>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbug
> >>
> zilla.kernel.org%2Fshow_bug.cgi%3Fid%3D211033&amp;data=04%7C01%7Cal
> ex
> >>
> ander.deucher%40amd.com%7Cad673e351bab4f6af94508d8b1977ed8%7C3d
> d8961f
> >>
> e4884e608e11a82d994e183d%7C0%7C0%7C637454612140560971%7CUnknow
> n%7CTWF
> >>
> pbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXV
> CI6
> >>
> Mn0%3D%7C3000&amp;sdata=8Rsnbfh4P5GmFUlybb31mT7C0Ee4vDInxJ1gt
> C3jrVI%3
> >> D&amp;reserved=0
> >>
> >> And I suspect this is the same:
> >>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbug
> >>
> s.archlinux.org%2Ftask%2F69202&amp;data=04%7C01%7Calexander.deuche
> r%4
> >>
> 0amd.com%7Cad673e351bab4f6af94508d8b1977ed8%7C3dd8961fe4884e608
> e11a82
> >>
> d994e183d%7C0%7C0%7C637454612140560971%7CUnknown%7CTWFpbGZsb
> 3d8eyJWIj
> >>
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3
> 000&a
> >>
> mp;sdata=1149tCcm3rcaj1MkVPbWqWhWFIPgkeBoYxo0oVv%2FzNI%3D&a
> mp;reserve
> >> d=0
> >>
> >> Reverting it from 5.10.4 makes things behave again.
> >>
> >> Have not tested 5.4.86 or 5.11-rc.
> >>
> >> I'm using a RX570 Polaris based card.
> >
> > Can you test 5.11-rc to see if this issue is there as well?
>
> Just did, and have the same issue on 5.11-rc2. Reverting it also solves the
> problem on 5.11-rc2, as it does on 5.10.4
>
> FWIW one easy way to reproduce seems to be unplugging and re-plugging
> the HDMI.

We are looking into the root cause, but I'll send out the revert for now.

Thanks,

Alex