Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with radeon
From: Michel Dänzer
Date: Thu Jan 21 2016 - 03:37:01 EST
On 21.01.2016 16:58, Daniel Vetter wrote:
> On Thu, Jan 21, 2016 at 03:41:27PM +0900, Michel Dänzer wrote:
>> On 21.01.2016 15:38, Michel Dänzer wrote:
>>> On 21.01.2016 14:31, Mario Kleiner wrote:
>>>> On 01/21/2016 04:43 AM, Michel Dänzer wrote:
>>>>> On 21.01.2016 05:32, Mario Kleiner wrote:
>>>>>>
>>>>>> So the problem is that AMDs hardware frame counters reset to
>>>>>> zero during a modeset. The old DRM code dealt with drivers doing that by
>>>>>> keeping vblank irqs enabled during modesets and incrementing vblank
>>>>>> count by one during each vblank irq, i think that's what
>>>>>> drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.
>>>>>
>>>>> Right, looks like there's been a regression breaking this. I suspect the
>>>>> problem is that vblank->last isn't getting updated from
>>>>> drm_vblank_post_modeset. Not sure which change broke that though, or how
>>>>> to fix it. Ville?
>>>>>
>>>>
>>>> The whole logic has changed and the software counter updates are now
>>>> driven all the time by the hw counter.
>>>>
>>>>>
>>>>> BTW, I'm seeing a similar issue with drm_vblank_on/off as well, which
>>>>> exposed the bug fixed by 209e4dbc ("drm/vblank: Use u32 consistently for
>>>>> vblank counters"). I've been meaning to track that down since then; one
>>>>> of these days hopefully, but if anybody has any ideas offhand...
>>>>
>>>> I spent the last few hours reading through the drm and radeon code and i
>>>> think what should probably work is to replace the
>>>> drm_vblank_pre/post_modeset calls in radeon/amdgpu by drm_vblank_off/on
>>>> calls. These are apparently meant for drivers whose hw counters reset
>>>> during modeset, [...]
>>>
>>> ... just like drm_vblank_pre/post_modeset. That those were broken is a
>>> regression which needs to be fixed anyway. I don't think switching to
>>> drm_vblank_on/off is suitable for stable trees.
>>
>> Even more so since as I mentioned, there is (has been since at least
>> about half a year ago) a counter jumping bug with drm_vblank_on/off as well.
>
> Hm, never noticed you reported that. I thought the reason for not picking
> up my drm_vblank_on/off patches was that there's a bug in amdgpu userspace
> where it tried to use vblank waits on a disabled pipe?
http://lists.freedesktop.org/archives/dri-devel/2015-July/086451.html
I don't know why it didn't get picked up.
> Can you please point me at the vblank on/off jump bug please?
AFAIR I originally reported it in response to
http://lists.freedesktop.org/archives/dri-devel/2015-August/087841.html
, but I can't find that in the archives, so maybe that was just on IRC.
See
http://lists.freedesktop.org/archives/dri-devel/2016-January/099122.html
. Basically, I ran into the bug fixed by your patch because the counter
jumped forward on every DPMS off, so it hit the 32-bit boundary after
just a few days.
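
To illustrate the 32-bit boundary, here is a minimal sketch in plain C. It is not the DRM core code: vblank_diff() and seconds_to_wrap() are hypothetical helpers, and the 60 Hz refresh rate in the note below is an assumption. It shows why the difference between two u32 counter samples survives a wraparound, and how long a counter that advances once per vblank would normally take to wrap.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a DRM function): number of vblanks between two
 * samples of a 32-bit counter. Unsigned subtraction is modulo 2^32, so the
 * result is still correct if the counter wrapped once between the samples.
 */
static uint32_t vblank_diff(uint32_t cur, uint32_t last)
{
	return cur - last;
}

/*
 * Hypothetical helper: whole seconds until a 32-bit counter starting at 0
 * wraps, assuming it advances exactly rate_hz counts per second and
 * rate_hz is non-zero.
 */
static uint64_t seconds_to_wrap(uint32_t rate_hz)
{
	return (UINT64_C(1) << 32) / rate_hz;
}
```

At an assumed 60 Hz, seconds_to_wrap(60) comes to roughly 828 days, so reaching the boundary within a few days implies the counter was jumping forward by large amounts rather than by one per vblank.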
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer