Re: [Intel-gfx] Skylake graphics regression: projector failure with 4.8-rc3

From: Thorsten Leemhuis
Date: Sun Sep 18 2016 - 07:35:29 EST

Hi! James & Paulo: What's the current status of this? Was this issue
discussed elsewhere or even fixed in the meantime? Just asking, because
this issue is on the list of regressions for 4.8. Ciao, Thorsten

On 01.09.2016 00:25, James Bottomley wrote:
> On Wed, 2016-08-31 at 21:51 +0000, Zanoni, Paulo R wrote:
>> On Wed, 2016-08-31 at 14:43 -0700, James Bottomley wrote:
>>> On Wed, 2016-08-31 at 11:23 -0700, James Bottomley wrote:
>>>> On Fri, 2016-08-26 at 09:10 -0400, James Bottomley wrote:
>>>>> We seem to have an xrandr regression with skylake now. What's
>>>>> happening is that I can get output on to a projector, but the
>>>>> system is losing video when I change the xrandr sessions (like
>>>>> going from a --above b to a --same-as b). The main screen goes
>>>>> blank, which is basically a reboot situation. Unfortunately, I
>>>>> can't seem to get the logs out of systemd to see if there was a
>>>>> dump to dmesg (the system was definitely responding).
>>>>> I fell back to 4.6.2 which worked perfectly, so this is definitely
>>>>> some sort of regression. I'll be able to debug more fully when I
>>>>> get back home from the Linux Security Summit.
>>>> I'm home now. Unfortunately, my monitor isn't as problematic as the
>>>> projector, but by flipping between various modes and separating and
>>>> overlaying the panels with --above and --same-as (xrandr), I can
>>>> eventually get it to the point where the main LCD panel goes black
>>>> and can only be restarted by specifying a different mode.
>>>> This seems to be associated with these lines in the X log:
>>>> [ 14714.389] (EE) intel(0): failed to set mode: Invalid argument [22]
>>>> But the curious thing is that even if this fails with the error
>>>> message once, it may succeed a second time, so it looks to be a
>>>> transient error translation problem from the kernel driver.
>>>> I've attached the full log below.
>>>> This is only with a VGA output. I currently don't have an HDMI
>>>> dongle, but I'm in the process of acquiring one.
>>> After more playing around, I'm getting thousands of these in the kernel
>>> log (possibly millions: the log wraps very fast):
>>> [23504.873606] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to train DP, aborting
>>> And then finally it gives up with
>>> [25023.770951] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
>>> [25561.926075] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
>>> And the crtc for the VGA output becomes non-responsive to any
>>> configuration command. This requires a reboot and sometimes a UEFI
>>> variable reset before it comes back.
>> Please see this discussion:
>> Do you have this patch on your tree? Does the problem go away if you
>> revert it?
> Yes, I've got it, it went in in 4.8-rc3 according to git:
> commit 58e311b09c319183254d9220c50a533e7157c9ab
> Author: Matt Roper <matthew.d.roper@xxxxxxxxx>
> Date: Thu Aug 4 14:08:00 2016 -0700
> drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
> Reverting it causes the secondary display not to sync pretty much at
> all. However, in the flickers I can see, it does work OK and no longer
> crashes when switching from --same-as to --above and back.
> I also still get the logs filling up with the link training errors.
> On balance, although the behaviour is different, it's not an
> improvement because if I can't sync with the projector, I can't really
> use this as a fix.
> James
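
To put a number on the "thousands, possibly millions" of link training
errors quoted above, something like the following might help. It is only
a rough sketch: the function name tally_i915_errors is made up for
illustration, and the regular expression assumes the kernel log lines
look like the ones James pasted (timestamp, then [drm:<function> [i915]],
then *ERROR* and the message on one line).

```python
import re
from collections import Counter

# Assumed line shape, taken from the messages quoted in this thread:
# [23504.873606] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to train DP, aborting
ERROR_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:(\w+) \[i915\]\]\s+\*ERROR\*\s+(.*)"
)


def tally_i915_errors(lines):
    """Count i915 *ERROR* lines, keyed by (reporting function, message)."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            func = match.group(2)
            message = match.group(3).strip()
            counts[(func, message)] += 1
    return counts
```

One way to use it would be to pass it sys.stdin and pipe the output of
dmesg in, then print counts.most_common(). Since the log wraps quickly,
the totals only cover whatever is still in the ring buffer.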