Re: [Intel-gfx] Skylake graphics regression: projector failure with 4.8-rc3

From: James Bottomley
Date: Mon Sep 19 2016 - 11:09:45 EST


On Sun, 2016-09-18 at 13:35 +0200, Thorsten Leemhuis wrote:
> Hi! James & Paulo: What's the current status of this?

No, the only interaction has been the suggestion below for a revert,
which didn't fix the problem.

> Was this issue discussed elsewhere or even fixed in between? Just
> asking, because this issue is on the list of regressions for 4.8.


I'm just about to try out -rc7, but it's not fixed so far.

James


> Ciao, Thorsten
>
> On 01.09.2016 00:25, James Bottomley wrote:
> > On Wed, 2016-08-31 at 21:51 +0000, Zanoni, Paulo R wrote:
> > > On Wed, 2016-08-31 at 14:43 -0700, James Bottomley wrote:
> > > > On Wed, 2016-08-31 at 11:23 -0700, James Bottomley wrote:
> > > > > On Fri, 2016-08-26 at 09:10 -0400, James Bottomley wrote:
> > > > > > > We seem to have an xrandr regression with skylake now. What's
> > > > > > > happening is that I can get output on to a projector, but the
> > > > > > > system is losing video when I change the xrandr sessions (like
> > > > > > > going from a --above b to a --same-as b). The main screen goes
> > > > > > > blank, which is basically a reboot situation. Unfortunately, I
> > > > > > > can't seem to get the logs out of systemd to see if there was a
> > > > > > > dump to dmesg (the system was definitely responding).
> > > > > >
> > > > > > > I fell back to 4.6.2 which worked perfectly, so this is definitely
> > > > > > > some sort of regression. I'll be able to debug more fully when I
> > > > > > > get back home from the Linux Security Summit.
> > > > >
> > > > > > I'm home now. Unfortunately, my monitor isn't as problematic as the
> > > > > > projector, but by flipping between various modes and separating and
> > > > > > overlaying the panels with --above and --same-as (xrandr), I can
> > > > > > eventually get it to the point where the main LCD panel goes black
> > > > > > and can only be restarted by specifying a different mode.
> > > > >
> > > > > > This seems to be associated with these lines in the X log:
> > > > >
> > > > > [ 14714.389] (EE) intel(0): failed to set mode: Invalid
> > > > > argument
> > > > > [22]
> > > > >
> > > > > > But the curious thing is that even if this fails with the error
> > > > > > message once, it may succeed a second time, so it looks to be a
> > > > > > transient error translation problem from the kernel driver.
> > > > >
> > > > > I've attached the full log below.
> > > > >
> > > > > > This is only with a VGA output. I currently don't have a HDMI
> > > > > > dongle, but I'm in the process of acquiring one.
> > > >
> > > > > After more playing around, I'm getting thousands of these in the
> > > > > kernel log (possibly millions: the log wraps very fast):
> > > >
> > > > [23504.873606] [drm:intel_dp_start_link_train [i915]] *ERROR*
> > > > failed
> > > > to train DP, aborting
> > > >
> > > > And then finally it gives up with
> > > >
> > > > [25023.770951] [drm:intel_cpu_fifo_underrun_irq_handler [i915]]
> > > > *ERROR* CPU pipe B FIFO underrun
> > > > [25561.926075] [drm:intel_cpu_fifo_underrun_irq_handler [i915]]
> > > > *ERROR* CPU pipe A FIFO underrun
> > > >
> > > > > And the crtc for the VGA output becomes non-responsive to any
> > > > > configuration command. This requires a reboot and sometimes a UEFI
> > > > > variable reset before it comes back.
> > >
> > > Please see this discussion:
> > > https://patchwork.freedesktop.org/patch/103237/
> > >
> > > > Do you have this patch on your tree? Does the problem go away if you
> > > > revert it?
> >
> > Yes, I've got it, it went in in 4.8-rc3 according to git:
> >
> > commit 58e311b09c319183254d9220c50a533e7157c9ab
> > Author: Matt Roper <matthew.d.roper@xxxxxxxxx>
> > Date: Thu Aug 4 14:08:00 2016 -0700
> >
> > > drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
> >
> > > Reverting it causes the secondary display not to sync pretty much at
> > > all. However, in the flickers I can see, it does work OK and doesn't
> > > now crash switching from --same-as to --above and back.
> >
> > I also still get the logs filling up with the link training errors.
> >
> > > On balance, although the behaviour is different, it's not an
> > > improvement because if I can't sync with the projector, I can't really
> > > use this as a fix.
> >
> > James
>
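
Since the kernel log quoted above wraps very fast, it may help to tally the
i915 errors per function rather than read them line by line. The following is
only a rough sketch, not anything taken from the thread: the helper name
count_i915_errors and the regular expression are assumptions, and it expects
each kernel message on a single unwrapped line, as dmesg would normally print
it. The sample text reuses the three messages reported above.

```python
import re
from collections import Counter

# Assumed line shape, based on the messages quoted in this thread:
#   [23504.873606] [drm:intel_dp_start_link_train [i915]] *ERROR* ...
ERROR_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:(\w+) \[i915\]\]\s+\*ERROR\*"
)

def count_i915_errors(log_text):
    """Return a Counter mapping i915 function name to its *ERROR* line count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# Sample input: the three messages from the report, one per line.
sample = """\
[23504.873606] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to train DP, aborting
[25023.770951] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[25561.926075] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
"""

if __name__ == "__main__":
    for func, n in count_i915_errors(sample).most_common():
        print(func, n)
```

To use it on a real system, the saved output of dmesg could be read from a
file and passed to count_i915_errors in place of the sample string.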