Re: PROBLEM: i915 causes complete desktop freezes in 4.15-rc5

From: Alexandru Chirvasitu
Date: Sat Jan 06 2018 - 13:44:38 EST


Thanks!

It's also a mystery to me why I never had any crashes on any of the
other systems installed on this machine, all of which run the same
(unpatched) kernels.

I'm assuming the window manager might have something to do with it:
all of the others use i3 and the buggy one uses openbox, so perhaps
tiling vs. stacking makes a difference?

The one pattern I noticed to the crashes was that they occurred upon
opening a new window.

On Sat, Jan 06, 2018 at 05:34:51PM +0000, Chris Wilson wrote:
> Quoting Alexandru Chirvasitu (2018-01-06 16:38:35)
> > On Sat, Jan 06, 2018 at 08:24:43AM -0500, Alexandru Chirvasitu wrote:
> > > Thank you!
> > >
> > > I'll apply that more elaborate patch you sent in the longer message to
> > > my clone of the repo and see if it still freezes.
> > >
> >
> > I'm on it now with no freezes yet, despite trying my best :).
> >
> > I have a question though:
> >
> > > On Sat, Jan 06, 2018 at 10:43:20AM +0000, Chris Wilson wrote:
> > > > Quoting Alexandru Chirvasitu (2018-01-05 22:05:18)
> > > > > Here we go.
> > > > >
> > > > > I have
> > > > >
> > > > > CONFIG_PAGE_POISONING not set
> > > > > CONFIG_SLUB_STATS=y
> > > > > CONFIG_SLUB_DEBUG not set
> > > > > CONFIG_KASAN=y
> > > > >
> > > > > .config attached along as well for verification, in case I missed
> > > > > anything.
> > > > >
> > > > > Again crashed by an attempt to open a terminal window.
> > > >
> > > > Gotcha,
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index b21322b50419..96cf46a10b4e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -472,7 +472,7 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
> > > >         struct drm_i915_gem_request *rq;
> > > >         struct intel_engine_cs *engine;
> > > >
> > > > -       if (!dma_fence_is_i915(fence))
> > > > +       if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
> > > >                 return;
> > > >
> > > >         rq = to_request(fence);
> >
> > I went back to Linus' tree and compared the respective i915_gem.c
> > files in the 4.14 and 4.15-rc6 commits. The offending piece of code
> > seems to be in both, so I am wondering why I was not getting freezes
> > before 4.15-rc.
>
> Yeah, I debated adding a Fixes: tag for commit 6b5e90f58c56
> ("drm/i915/scheduler: Boost priorities for flips") that introduced this
> code, but decided it's just an optimisation at this point and that we
> should only regard commit 1f181225f8ec ("drm/i915/execlists: Keep
> request->priority for its lifetime") as introducing the breakage. Prior
> to commit 1f181225f8ec, the guard at the start of execlists_schedule(),
> (prio <= rq->priotree.priority), was sufficient to avoid manipulating
> retired fences, and so we were avoiding this bug.
> -Chris
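
For my own understanding, here is a small userspace sketch of the two
guards discussed above. It is not kernel code: struct mock_fence and
the mock_* functions are hypothetical stand-ins that I made up, and
only the two conditions (the signaled/not-i915 check from the patch and
the prio <= priority comparison) are taken from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request fence; NOT the kernel's struct dma_fence. */
struct mock_fence {
	bool signaled;	/* request already completed, may have been retired */
	bool is_i915;	/* fence belongs to this driver */
	int priority;	/* current scheduling priority */
	int boosts;	/* number of times the priority was actually raised */
};

/*
 * Modelled on the comparison quoted for execlists_schedule():
 * do nothing unless the new priority is strictly higher.
 */
static void mock_schedule(struct mock_fence *f, int prio)
{
	if (prio <= f->priority)
		return;

	f->priority = prio;
	f->boosts++;
}

/*
 * Modelled on the patched check in __fence_set_priority():
 * bail out early for fences that are signaled or foreign, so a
 * completed request is never handed to the scheduler.
 */
static void mock_fence_set_priority(struct mock_fence *f, int prio)
{
	if (f->signaled || !f->is_i915)
		return;

	mock_schedule(f, prio);
}
```

With this model, boosting a fence that is already signaled leaves it
untouched, while a live fence has its priority raised exactly once for
a given higher value.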