Re: [Intel-gfx] [PATCH] drm/i915: Remove instructions to file a bug report.
From: Chris Wilson
Date: Sat Dec 03 2016 - 03:58:49 EST
On Fri, Dec 02, 2016 at 05:03:05PM -0800, Matt Turner wrote:
> From these instructions, users assume that /sys/class/drm/card0/error
> contains all the information a developer needs to diagnose and fix a GPU
> hang.
>
> In fact it doesn't, and we have no tools for solving them (other than
> stabbing in the dark). Most of the time the error state itself isn't
> even useful because it just shows a hang on a PIPE_CONTROL or similar.
>
> Until a time when the error state contains enough information to
> actually solve a hang, stop telling users to file unsolvable bugs, and
> instead rely on users who know where and how to file a good bug report
> to find their own way there.
>
> Signed-off-by: Matt Turner <mattst88@xxxxxxxxx>
Nak. Though having stale bug reports is a pain (we've recently adopted
the policy of stopping the request after a certain period), those bug
reports are still vital. They don't just represent bugs in mesa.
> ---
> Maybe now's a good time to discuss what *would* be useful to put in the
> error state for debugging hangs. The currently executing shader program
> would be a great place to start.
Now? That is the conversation we've been trying to have for several
years. The contents of the error state are currently about sufficient to
spot kernel bugs, triage the culprit and the general class of bug.
Capturing all state for a request is infeasible (because we can't copy
the gigabytes of memory required). Copying a selected set of aux bo is
one option. And since those bo are under user control and do not have to
be executed, you can even store aub data in them or whatnot.
Even if you make attaching the debug information conditional, I would
still keep the error message unconditional.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre